Commit Graph

15 Commits

Author SHA1 Message Date
Eugen Rochko
0762258aec
Fix hashtags being split by ZWNJ character (#11821)
Fix #11761
2019-09-13 16:01:26 +02:00
Eugen Rochko
cc0a55cf9a
Add more accurate hashtag search (#11579)
* Add more accurate hashtag search

Using ElasticSearch to index hashtags with edge n-grams and score
them by usage within the last 7 days since last activity. Only
hashtags that have been reviewed and are listable can appear in
searches, unless they match the query exactly

* Fix search analyzer dropping non-ascii characters
2019-08-18 03:45:51 +02:00
Eugen Rochko
92de439c04
Change hashtag search to only return results that have trended in the past (#11448)
* Change hashtag search to only return results that have trended in the past

A way to eliminate typos and other one-off "junk" results

* Fix excluding exact matches that don't have a score

* Fix tests
2019-07-30 20:29:50 +02:00
Eugen Rochko
e136112ab7
Fix tag normalization and migration not removing duplicate tags (#11441)
Fix #11428
2019-07-29 20:40:21 +02:00
ThibG
c37c1da41e Disallow numeric-only hashtags (#11363)
* Add spec covering numeric-only hashtags

* Fix hashtag regex
2019-07-19 23:22:35 +02:00
Eugen Rochko
84e988479e
Fix only one middle dot being recognized in hashtags (#11345)
Fix #10934
2019-07-18 03:02:56 +02:00
ysksn
7d7df877ef Add a test for Tag#to_param (#5705) 2017-11-15 16:04:41 +01:00
masarakki
a49be27145 add validation to tag name (#4194) 2017-07-14 11:02:49 +02:00
unarist
c2753fdfb4 Make tag search case insensitive again (#4184) 2017-07-13 19:31:33 +02:00
Eugen Rochko
3f59238207 Add important test for full-width hashtags (#3911) 2017-06-23 17:01:53 +02:00
unarist
004672aa6c Fix tag search order and not to use tsvector (#3611)
* Sort results by the name
* Switch search method to simple `LIKE` matching instead of tsvector/tsquery

Previously we used scores from ts_rank_cd() to sort results, but it didn't work
because the function returns same score for all results. It's not for calculate
similarity of single words. Sometimes this bug even push out exact matching tag
from results.

Additionally, PostgreSQL supports prefix searching with standard btree index.
Using it offers simpler code, but also less index size and some speed.
2017-06-06 16:07:06 +02:00
Matt Jankowski
388ec0d5b6 Search cleanup (#1333)
* Clean up SQL output in Tag and Account search methods

* Add basic coverage for Tag.search_for

* Add coverage for Account.search_for

* Add coverage for Account.advanced_search_for
2017-04-09 14:45:01 +02:00
Eugen Rochko
4fb95c91fb Fix wrongful matching of last period in extended usernames
Fix anchor tags in some wikipedia URLs being matches as a hashtag
2017-03-05 18:08:19 +01:00
Eugen Rochko
546c4718e7 Localizations for most server-side strings 2016-11-16 00:55:33 +01:00
Eugen Rochko
62292797ec Adding hashtag model 2016-11-04 19:12:59 +01:00