Ditto for ending with \b.
Consider muting the phrase "(hot take)". I stipulate it is reasonable
to enter this with the default "match whole word" behavior. Under the
old behavior, this would be encoded as
\b\(hot\ take\)\b
However, if \b is before the first character in the string and the first
character in the string is not a word character, then the match will
fail. Ditto for after. In our example, "(" is not a word character, so
this will not match statuses containing "(hot take)", and that's a very
surprising behavior.
To address this, we only add leading and trailing \b to keywords that
start or end with word characters.
There are two motivations for this:
1. It looks like we're going to add other features that require
server-side storage (e.g. user notes).
2. Namespacing glitchsoc modifications is a good idea anyway: even if we
do not end up doing (1), if upstream introduces a keyword-mute feature
that also uses a "KeywordMute" model, we can avoid some merge
conflicts this way and work on the more interesting task of
choosing which implementation to use.
Word-boundary matching only works as intended in English and languages
that use similar word-breaking characters; it doesn't work so well in
(say) Japanese, Chinese, or Thai. It's unacceptable to have a feature
that doesn't work as intended for some languages. (Moreso especially
considering that it's likely that the largest contingent on the Mastodon
bit of the fediverse speaks Japanese.)
There are rules specified in Unicode TR29[1] for word-breaking across
all languages supported by Unicode, but the rules deliberately do not
cover all cases. In fact, TR29 states
For example, reliable detection of word boundaries in languages such
as Thai, Lao, Chinese, or Japanese requires the use of dictionary
lookup, analogous to English hyphenation.
So we aren't going to be able to make word detection work with regexes
within Mastodon (or glitchsoc). However, for a first pass (even if it's
kind of punting) we can allow the user to choose whether they want word
or substring detection and warn about the limitations of this
implementation in, say, docs.
[1]: https://unicode.org/reports/tr29/https://web.archive.org/web/20171001005125/https://unicode.org/reports/tr29/
This should eventually be accessible via the API and the web frontend,
but I find it easier to set up an editing interface using Rails
templates and the like. We can always take it out if it turns out we
don't need it.
The intent of the previous concatenation was to minimize object
allocations, which can end up being a slow killer. However, it turns
out that under MRI 2.4.x, the shove-strings-in-an-array-and-join method
is not only arguably more common but (in this particular case) actually
allocates *fewer* objects than the string concatenation.
Or, at least, that's what I gather by running this:
words = %w(palmettoes nudged hibernation bullish stockade's tightened Hades
Dixie's formalize superego's commissaries Zappa's viceroy's apothecaries
tablespoonful's barons Chennai tollgate ticked expands)
a = Account.first
KeywordMute.transaction do
words.each { |w| KeywordMute.create!(keyword: w, account: a) }
GC.start
s1 = GC.stat
re = String.new.tap do |str|
scoped = KeywordMute.where(account: a)
keywords = scoped.select(:id, :keyword)
count = scoped.count
keywords.find_each.with_index do |kw, index|
str << Regexp.escape(kw.keyword.strip)
str << '|' if index < count - 1
end
end
s2 = GC.stat
puts s1.inspect, s2.inspect
raise ActiveRecord::Rollback
end
vs this:
words = %w( palmettoes nudged hibernation bullish stockade's tightened Hades Dixie's
formalize superego's commissaries Zappa's viceroy's apothecaries tablespoonful's
barons Chennai tollgate ticked expands
)
a = Account.first
KeywordMute.transaction do
words.each { |w| KeywordMute.create!(keyword: w, account: a) }
GC.start
s1 = GC.stat
re = [].tap do |arr|
KeywordMute.where(account: a).select(:keyword, :id).find_each do |m|
arr << Regexp.escape(m.keyword.strip)
end
end.join('|')
s2 = GC.stat
puts s1.inspect, s2.inspect
raise ActiveRecord::Rollback
end
Using rails r, here is a comparison of the total_allocated_objects and
malloc_increase_bytes GC stat data:
total_allocated_objects malloc_increase_bytes
string concat 3200241 -> 3201428 (+1187) 1176 -> 45216 (44040)
array join 3200380 -> 3201299 (+919) 1176 -> 36448 (35272)
It would also have been valid to get rid of the attr_reader, but I like
being able to reach inside KeywordMute::Matcher without resorting to
instance_variable_get tomfoolery.
A matcher object that builds a match from KeywordMute data and runs it
over text is, in my view, one of the easier ways to write examples for
this sort of thing.
Gist of the proposed keyword mute implementation:
Keyword mutes are represented server-side as one keyword per record.
For each account, there exists a keyword regex that is generated as one
big alternation of all keywords. This regex is cached (in Redis, I
guess) so we can quickly get it when filtering in FeedManager.
On desktop, the compose text box grows to accommodate the content. On
mobile, the text box does not grow to accommodate text context, but does
grow to accommodate images. It is possible in both cases to overflow
the available area, which makes accessing other UI elements (e.g.
visibility setttings) difficult.
This commit makes the compose area optionally scrollable, which allows
those UI elements to remain available even if they go off-screen.
Commit 6e54719474 moved the Mastodon
variables and mixins deeper in the directory hierarchy; this commit
brings the glitch components in line with that change.
* Swedish file added
* Swedish file added
* Swedish file updated
* Swedish languagefile added
* Add Swedish translation
* Add Swedish translation
* Started the Swedish translation
* Added Swedish lang settings
* Updating Swedish language
* Updating Swedish language
* Updating Swedish language
* Updating Swedish language
* Updating Swedish language
* Updating Swedish language
* Swedish language completed and added
* Swedish language Simple_form added
* Swedish language Divise added
* Swedish language doorkeeper added
* Swedish language - now all file complete
* Swedish - Typos and supplementation in sentence structure
* Update simple_form.sv.yml
* Update sv.yml
* Update sv.yml
Rearranged the alphabetical order.
Specifically, this fixes status length calculation to be same as JS side.
BTW, since this pattern used in not only preview card fetching, we
should extract it (with twitter-regex?) and write tests I think.
* Clean up reblog-tracking sets from FeedManager
Builds on #5419, with a few minor optimizations and cleanup of sets
after they are no longer needed.
* Update tests, fix multiply-reblogged case
Previously, we would have lost the fact that a given status was
reblogged if the displayed reblog of it was removed, now we don't.
Also added tests to make sure FeedManager#trim cleans up our reblog
tracking keys, fixed up FeedCleanupScheduler to use the right loop,
and fixed the test for it.
* Swedish file added
* Swedish file added
* Swedish file updated
* Swedish languagefile added
* Add Swedish translation
* Add Swedish translation
* Started the Swedish translation
* Added Swedish lang settings
* Updating Swedish language
* Updating Swedish language
* Updating Swedish language
* Updating Swedish language
* Updating Swedish language
* Updating Swedish language
* Swedish language completed and added
* Swedish language Simple_form added
* Swedish language Divise added
* Swedish language doorkeeper added
* Swedish language - now all file complete
* Keep references to all reblogs of a status on home feed
When inserting reblog: Add to set of reblogs of this status on
the feed, if original status was present in the feed, add it to
that set as well.
When removing a reblog: Remove it from that set. Take random
remaining item from the set. If one exists, re-insert it into feed,
otherwise do not re-insert anything.
Fix#4210
* When original is removed, toss out reblog references
Fix#5398
Ordering the home timeline query by account_id meant that the first
100 items belonged to a single account. There was also no reason to
reverse-iterate over the statuses. Assuming the user accesses the
feed halfway-through, it's better to have recent statuses already
available at the top. Therefore working from newer->older is ideal.
If the algorithm ends up filtering all items out during last-mile
filtering, repeat again a page further. The algorithm terminates
when either at least one item has been added, or if the database
query returns nothing (end of data reached)
We've changed un-reblogging behavior when we implement Snowflake, to insert un-reblogged status at the position reblogging status existed.
However, our API expects home timeline is ordered by status ids, and max_id/since_id filters by zset score. Due to this, un-reblogged status appears as a last item of result set, and timeline expansion may skips many statuses.
So this reverts that change...reblogged status inserted at corresponding position to its id.
* Add option to reduce motion
* Use HOC to wrap all Motion calls
* fix case-sensitive issue
* Avoid updating too frequently
* Get rid of unnecessary change to _simple_status.html.haml
* Fix some doodle bugs and added Background color functionality
* added protections against accidental doodle erase, screen size changing
* resolve react warning about 'selected' on <option>
The main change of this PR is removing `order by visibility` hack.
This was introduced to force using of `index_statuses_on_account_id` instead of PK index, but it seems no longer needed probably due to `index_statuses_on_account_id_id`. Removing this avoids reading all rows, so really improves first fetching of the user who has lot of statuses.
I have also changed JOIN to IN + subquery, which slightly faster in most cases.
- For some reason, :if option on before_action did not work. It got
executed every time, returned false, and the action run anyway,
which led to the current_sign_in_at and sign_in_count being
updated on every request
- Return "do not filter" early in FeedManager#filter_from_home? if
the status is authored by receiver. Usually this method is not
called for own statuses at all, but it is called when Feed#get
uses the database
- Return early if #reload_stale_associations! has nothing to load
to save a database query with WHERE 1=0
Do NOT send "delete" through streaming API when unmerging from
home timeline. "delete" implies that the original status was
deleted, which is not true!
Remote ActivityPub users that have never been known as OStatus users
(whether or not they support it) will not have a “remote_url” attribute
set. In case they reside on an instance with WEB_DOMAIN ≠ LOCAL_DOMAIN,
the current check did rely on “remote_url” to verify the user's domain.
* Retoot count increases without reason
-The store_uri method for Statuses was being called on after_create and causing reblogs to be incremented twice.
-This calls it when the transaction is finished by using after_create_commit.
-Fixes #4916.
* Added test case for after_create_commit callback for checking reblog count.
* Rewrote test to keep original, but added one for only the after_create_commit callback.
* If OEmbed response doesn't have a required property `type`, ignore it.
e.g. `NoMethodError: undefined method 'type' for ...`
* If we failed to detect encoding, fallback to default behavior of Nokogiri.
e.g. `KeyError: key not found: :encoding`
* Ajout du support des thèmes multiples
Ajoute des traductions pour les nouvelles chaînes permettant le support de thèmes multiples.
Add translations for the new strings allowing support for multiple themes.
* Mise à jour de la traduction
Met à jour les chaînes modifiées et ajoute des traductions pour celle n’en ayant pas.
Update modified strings and add new translations for the ones who are missing them.
* Remplace « ' » par « ’ »
Retire de la traduction les apostrophes droites « ' » (U+0027) au profit des apostrophes typographiques « ’ » (U+2019).
En typographie française, les apostrophes typographiques sont utilisées à la place des apostrophes droites. La traduction était incohérente et utilisait les deux.
Remove from the translation all the vertical apostrophes (U+0027) in favor of the curly ones (U+2019).
In French typography, typographic apostrophes are used instead of vertical ones. The translation was incoherent and used both.
When ancestors get loaded, we scroll to the target status (i.e. skip
ancestors). However, ancestors may get loaded before the status itself,
then it causes TypeError because `this.node` is undefined yet.
Since we don't show anything until the status gets loaded, we don't need
to scroll to the target status in this time. If we get the status itslef
later, it causes `componentDidUpdate` and scrolling correctly.
* Add foreign key constraint to column `account` in `account_moderation_notes`
* Change account_id and target_account_id to non-nullable in account_moderation_notes
* Add dependent: :destroy to account and target_account in account_moderation_notes
* Track frequently used emojis in web UI
* Persist emoji usage, but debounce commits to the settings API
* Fix#5144 - Add tooltips to picker
* Display only 2 lines of frequently used emojis
- Rename Mastodon::TimestampIds into Mastodon::Snowflake for clarity
- Skip for statuses coming from inbox, aka delivered in real-time
- Skip for statuses that claim to be from the future