Removed PDF

aschwarz
2023-01-23 11:03:31 +01:00
parent 82d562a322
commit a6523903eb
28078 changed files with 4247552 additions and 2 deletions


@@ -0,0 +1,13 @@
# Contributing
To add a new referrer spammer to the list, [click here to edit the spammers.txt file](https://github.com/matomo-org/referrer-spam-blacklist/edit/master/spammers.txt) and create a pull request. Alternatively you can create a [new issue](https://github.com/matomo-org/referrer-spam-blacklist/issues/new).
If you open a pull request, please:
- **add one new domain per pull request**
- explain where the referrer domain appeared and why you think it is a spammer
- name the pull request in the format `Add xxx.yyy` so that it's easy to manage duplicates (for example `Add cyber-monday.ga`)
- keep the list ordered alphabetically
- use [Linux line endings](http://en.wikipedia.org/wiki/Newline)
Before opening a new issue or pull request, please [search](https://github.com/matomo-org/referrer-spam-blacklist/issues?utf8=%E2%9C%93&q=is%3Aopen+) to check whether somebody has already reported the host.


@@ -0,0 +1,92 @@
This is a community-contributed list of [referrer spammers](http://en.wikipedia.org/wiki/Referer_spam) maintained by [Matomo](https://matomo.org/), the leading open source web analytics platform.
## Usage
The list is stored in this repository in `spammers.txt`. This text file contains one host per line.
You can [download this file manually](https://github.com/matomo-org/referrer-spam-blacklist/blob/master/spammers.txt), download the [whole folder as zip](https://github.com/matomo-org/referrer-spam-blacklist/archive/master.zip) or clone the repository using git:
```
git clone https://github.com/matomo-org/referrer-spam-blacklist.git
```
### PHP
If you are using PHP, you can also install the list through Composer:
```
composer require matomo/referrer-spam-blacklist
```
Parsing the file should be pretty easy using your favorite language. Beware that the file can contain empty lines.
Here is an example using PHP:
```php
$list = file('spammers.txt', FILE_IGNORE_NEW_LINES | FILE_SKIP_EMPTY_LINES);
```
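The same parsing can be sketched in other languages. The Python version below is only an illustration: `parse_spammers` and `load_spammers` are hypothetical helper names, not part of this repository. It skips empty lines, as the PHP flags above do.

```python
from pathlib import Path


def parse_spammers(text):
    """Return the non-empty hosts from the contents of spammers.txt."""
    # Strip whitespace and drop empty lines, since the file can contain them.
    return [line.strip() for line in text.splitlines() if line.strip()]


def load_spammers(path="spammers.txt"):
    """Read the list from disk; the default path is an assumption."""
    return parse_spammers(Path(path).read_text(encoding="utf-8"))
```

For example, `parse_spammers("0n-line.tv\n\n100dollars-seo.com\n")` yields the two hosts without the blank entry.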
### Nginx
Nginx's `server` block can be configured to check the referer and return an error:
```nginx
if ($http_referer ~ '0n-line.tv') {return 403;}
if ($http_referer ~ '100dollars-seo.com') {return 403;}
...
```
When combined into a single pattern, the list exceeds the maximum length of a single regular expression, so the hosts must be split into separate rules as shown above.
Here is a bash script to create an nginx conf file:
```bash
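# Drop empty lines (an empty pattern would match every referer), escape dots
# for use in a regex, and write one rule per host. The output file is
# overwritten on each run, so rerunning the script does not duplicate rules.
sort -u spammers.txt | sed -e '/^$/d' -e 's/\./\\./g' | while IFS= read -r host
do
    printf "if (\$http_referer ~ '%s') {return 403;}\n" "$host"
done > /etc/nginx/referer_spam.conf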
```
You would then add `include /etc/nginx/referer_spam.conf;` inside your `server` block.
To keep the list up to date, run the same steps as a daily cron job:
```bash
0 0 * * * cd /etc/nginx/referrer-spam-blacklist/ && git pull > /dev/null && echo "" > /etc/nginx/referer_spam.conf && sort spammers.txt | uniq | sed 's/\./\\\\\\\\./g' | while read host; do echo "if (\$http_referer ~ '$host') {return 403;}" >> /etc/nginx/referer_spam.conf; done; service nginx reload > /dev/null
```
### In Matomo (formerly Piwik)
This list is included in each [Matomo](https://matomo.org) release so that referrer spam is filtered automatically. Matomo will also automatically update this list to its latest version every week.
## Contributing
To add a new referrer spammer to the list, [click here to edit the spammers.txt file](https://github.com/matomo-org/referrer-spam-blacklist/edit/master/spammers.txt) and select `Create a new branch for this commit and start a pull request.` Alternatively, you can create a [new issue](https://github.com/matomo-org/referrer-spam-blacklist/issues/new). In your issue or pull request, please explain where the referrer domain appeared and why you think it is a spammer. **Please open one pull request per new domain**.
If you open a pull request, it is appreciated if you keep one hostname per line, keep the list ordered alphabetically, and use [Linux line endings](http://en.wikipedia.org/wiki/Newline).
Before opening a new issue or pull request, please [search](https://github.com/matomo-org/referrer-spam-blacklist/issues) to check whether somebody has already reported the host.
### Subdomains
Matomo does sub-string matching on domain names from this list, so adding `semalt.com` is enough to block all subdomain referrers too, such as `semalt.semalt.com`.
However, there are cases where you'd only want to add a subdomain but not the root domain. For example, add `referrerspammer.tumblr.com` but not `tumblr.com`, otherwise all `*.tumblr.com` sites would be affected.
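The effect of sub-string matching can be sketched as follows. This is an illustration in Python, not Matomo's actual implementation; `is_referrer_spam` is a hypothetical name, and matching against the host part only (rather than the full URL) is an assumption.

```python
from urllib.parse import urlparse


def is_referrer_spam(referrer_url, spammers):
    """Return True if any listed host is a sub-string of the referrer host."""
    # Assumption: only the host name of the referrer is compared.
    host = (urlparse(referrer_url).hostname or "").lower()
    return any(spammer in host for spammer in spammers)
```

With `["semalt.com"]` in the list, `http://semalt.semalt.com/crawler.php` is flagged. With only `referrerspammer.tumblr.com` in the list, other `*.tumblr.com` sites are left alone.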
### Sorting
To keep the list sorted the same way across forks, it is recommended to let the computer do the sorting. The list follows the merge sort algorithm as implemented in [sort](https://en.wikipedia.org/wiki/Sort_(Unix)). You can use `sort` to both sort the list and filter out duplicates:
```
sort -uf -o spammers.txt spammers.txt
```
### Community Projects
[Apache .htaccess referrer spam blacklist](https://github.com/kambrium/apache-referrer-spam-blacklist) - A script for Apache users that generates a list of RewriteConds based on `spammers.txt`.
## Disclaimer
This list of referrer spammers is contributed by the community and is provided as is. Use at your own discretion: it may be incomplete (although we aim to keep it up to date) and it may contain outdated entries (let us know if a hostname was added but is not actually a spammer).
## License
Public Domain (no copyright).


@@ -0,0 +1,8 @@
{
"name": "matomo/referrer-spam-blacklist",
"description": "Community-contributed list of referrer spammers",
"license": "CC0-1.0",
"replace": {
"piwik/referrer-spam-blacklist":"*"
}
}


@@ -0,0 +1,926 @@
03e.info
0n-line.tv
1-99seo.com
1-free-share-buttons.com
100dollars-seo.com
100searchengines.com
12masterov.com
12u.info
1pamm.ru
1webmaster.ml
24x7-server-support.site
2your.site
3-letter-domains.net
3waynetworks.com
4inn.ru
4istoshop.com
4webmasters.org
5-steps-to-start-business.com
5forex.ru
6hopping.com
7kop.ru
7makemoneyonline.com
7zap.com
abcdefh.xyz
abcdeg.xyz
abclauncher.com
acads.net
acarreo.ru
acunetix-referrer.com
adanih.com
adcash.com
adf.ly
adspart.com
adtiger.tk
adventureparkcostarica.com
adviceforum.info
advokateg.xyz
affordablewebsitesandmobileapps.com
afora.ru
aibolita.com
aidarmebel.kz
akuhni.by
alfabot.xyz
alibestsale.com
aliexsale.ru
alinabaniecka.pl
alkanfarma.org
allergick.com
allergija.com
allknow.info
allmarketsnewdayli.gdn
allnews.md
allnews24.in
allwomen.info
allwrighter.ru
alpharma.net
altermix.ua
amazon-seo-service.com
amt-k.ru
amtel-vredestein.com
anal-acrobats.hol.es
analytics-ads.xyz
anapa-inns.ru
android-style.com
animalphotos.xyz
animenime.ru
anticrawler.org
antiguabarbuda.ru
apteka-pharm.ru
arendakvartir.kz
arendovalka.xyz
arkkivoltti.net
artpaint-market.ru
artparquet.ru
aruplighting.com
ask-yug.com
atleticpharm.org
atyks.ru
auto-complex.by
auto-kia-fulldrive.ru
autoblog.org.ua
autoseo-traffic.com
autovideobroadcast.com
aviva-limoux.com
avkzarabotok.info
avtovykup.kz
azartclub.org
azbukafree.com
azlex.uz
baixar-musicas-gratis.com
baladur.ru
balitouroffice.com
balkanfarma.org
bard-real.com.ua
batut-fun.ru
bavariagid.de
beachtoday.ru
bedroomlighting.us
beremenyashka.com
best-deal-hdd.pro
best-ping-service-usa.blue
best-seo-offer.com
best-seo-software.xyz
best-seo-solution.com
bestmobilityscooterstoday.com
bestofferhddbyt.info
bestofferhddeed.info
bestwebsitesawards.com
betterhealthbeauty.com
bezprostatita.com
bif-ru.info
biglistofwebsites.com
billiard-classic.com.ua
bio-market.kz
biplanecentre.ru
bird1.ru
biteg.xyz
bizru.info
black-friday.ga
blackhatworth.com
blog100.org
blogtotal.de
blue-square.biz
bluerobot.info
boltalko.xyz
boostmyppc.com
bpro1.top
brakehawk.com
brateg.xyz
break-the-chains.com
brillianty.info
brk-rti.ru
brothers-smaller.ru
brusilov.ru
bsell.ru
budilneg.xyz
budmavtomatika.com.ua
bufetout.ru
buketeg.xyz
bukleteg.xyz
burger-imperia.com
burn-fat.ga
buttons-for-website.com
buttons-for-your-website.com
buy-cheap-online.info
buy-cheap-pills-order-online.com
buy-forum.ru
buy-meds24.com
call-of-duty.info
cardiosport.com.ua
cartechnic.ru
cenokos.ru
cenoval.ru
cezartabac.ro
chcu.net
cheap-trusted-backlinks.com
chelyabinsk.dienai.ru
chinese-amezon.com
chizhik-2.ru
ci.ua
cityadspix.com
civilwartheater.com
cleaningservices.kiev.ua
clicksor.com
climate.by
club-lukojl.ru
coderstate.com
codysbbq.com
coffeemashiny.ru
columb.net.ua
commerage.ru
comp-pomosch.ru
compliance-alex.xyz
compliance-alexa.xyz
compliance-andrew.xyz
compliance-barak.xyz
compliance-brian.xyz
compliance-don.xyz
compliance-donald.xyz
compliance-elena.xyz
compliance-fred.xyz
compliance-george.xyz
compliance-irvin.xyz
compliance-ivan.xyz
compliance-john.top
compliance-julianna.top
computer-remont.ru
conciergegroup.org
connectikastudio.com
cookie-law-enforcement-aa.xyz
cookie-law-enforcement-bb.xyz
cookie-law-enforcement-cc.xyz
cookie-law-enforcement-dd.xyz
cookie-law-enforcement-ee.xyz
cookie-law-enforcement-ff.xyz
cookie-law-enforcement-gg.xyz
cookie-law-enforcement-hh.xyz
cookie-law-enforcement-ii.xyz
cookie-law-enforcement-jj.xyz
cookie-law-enforcement-kk.xyz
cookie-law-enforcement-ll.xyz
cookie-law-enforcement-mm.xyz
cookie-law-enforcement-nn.xyz
cookie-law-enforcement-oo.xyz
cookie-law-enforcement-pp.xyz
cookie-law-enforcement-qq.xyz
cookie-law-enforcement-rr.xyz
cookie-law-enforcement-ss.xyz
cookie-law-enforcement-tt.xyz
cookie-law-enforcement-uu.xyz
cookie-law-enforcement-vv.xyz
cookie-law-enforcement-ww.xyz
cookie-law-enforcement-xx.xyz
cookie-law-enforcement-yy.xyz
cookie-law-enforcement-zz.xyz
copyrightclaims.org
copyrightinstitute.org
covadhosting.biz
cp24.com.ua
cubook.supernew.org
customsua.com.ua
cyber-monday.ga
dailyrank.net
darodar.com
dawlenie.com
dbutton.net
dcdcapital.com
deart-13.ru
delfin-aqua.com.ua
demenageur.com
dengi-v-kredit.in.ua
dermatovenerologiya.com
descargar-musica-gratis.net
detskie-konstruktory.ru
dev-seo.blog
dienai.ru
diplomas-ru.com
dipstar.org
distonija.com
dividendo.ru
djekxa.ru
djonwatch.ru
dktr.ru
docs4all.com
docsarchive.net
docsportal.net
documentbase.net
documentserver.net
documentsite.net
dogsrun.net
dojki-hd.com
domain-tracker.com
domashniy-hotel.ru
dominateforex.ml
domination.ml
doska-vsem.ru
dostavka-v-krym.com
dosugrostov.site
drupa.com
dvr.biz.ua
e-buyeasy.com
e-commerce-seo.com
e-commerce-seo1.com
earn-from-articles.com
earnity-money.info
easycommerce.cf
ecommerce-seo.org
ecomp3.ru
econom.co
edakgfvwql.ru
edudocs.net
eduinfosite.com
eduserver.net
egovaleo.it
ek-invest.ru
ekatalog.xyz
eko-gazon.ru
ekoproekt-kr.ru
ekto.ee
elektrikovich.ru
elementspluss.ru
elentur.com.ua
elmifarhangi.com
elvel.com.ua
emerson-rus.ru
eric-artem.com
erot.co
escort-russian.com
este-line.com.ua
etairikavideo.gr
etehnika.com.ua
eu-cookie-law-enforcement2.xyz
euromasterclass.ru
europages.com.ru
eurosamodelki.ru
event-tracking.com
exdocsfiles.com
express-vyvoz.ru
eyes-on-you.ga
f1nder.org
fanoboi.com
fast-wordpress-start.com
fbdownloader.com
feminist.org.ua
fidalsa.de
filesclub.net
filesdatabase.net
filter-ot-zheleza.ru
finansov.info
findercarphotos.com
fix-website-errors.com
floating-share-buttons.com
flowertherapy.ru
for-your.website
forex-procto.ru
forsex.info
fortwosmartcar.pw
forum69.info
foxweber.com
frauplus.ru
free-fb-traffic.com
free-fbook-traffic.com
free-floating-buttons.com
free-share-buttons.com
free-social-buttons.com
free-social-buttons.xyz
free-social-buttons7.xyz
free-traffic.xyz
free-video-tool.com
free-website-traffic.com
freenode.info
freewhatsappload.com
freewlan.info
freshnails.com.ua
fsalas.com
game300.ru
gandikapper.ru
gearcraft.us
gearsadspromo.club
generalporn.org
germes-trans.com
get-clickize.info
get-free-social-traffic.com
get-free-traffic-now.com
get-more-freeer-visitors.info
get-more-freeish-visitors.info
get-your-social-buttons.info
getaadsincome.info
getadsincomely.info
getlamborghini.ga
getrichquick.ml
getrichquickly.info
ghazel.ru
ghostvisitor.com
giftbig.ru
girlporn.ru
gkvector.ru
glavprofit.ru
global-smm.ru
gobongo.info
goodhumor24.com
goodprotein.ru
google-liar.ru
googlemare.com
googlsucks.com
gorgaz.info
guardlink.org
guidetopetersburg.com
handicapvantoday.com
happysong.ru
hard-porn.mobi
havepussy.com
hawaiisurf.com
hdmoviecamera.net
hdmoviecams.com
healbio.ru
healgastro.com
homeafrikalike.tk
homemypicture.tk
hongfanji.com
hosting-tracker.com
hottour.com
housediz.com
housemilan.ru
howopen.ru
howtostopreferralspam.eu
hoztorg-opt.ru
hseipaa.kz
hulfingtonpost.com
humanorightswatch.org
hundejo.com
hvd-store.com
hyip-zanoza.me
ico.re
igadgetsworld.com
igru-xbox.net
ilikevitaly.com
iloveitaly.ro
iloveitaly.ru
ilovevitaly.co
ilovevitaly.com
ilovevitaly.info
ilovevitaly.org
ilovevitaly.ru
ilovevitaly.xyz
iminent.com
imperiafilm.ru
impotentik.com
incitystroy.ru
incomekey.net
increasewwwtraffic.info
inet-shop.su
infektsii.com
infodocsportal.com
inform-ua.info
insider.pro
interferencer.ru
intex-air.ru
investpamm.ru
iskalko.ru
isotoner.com
ispaniya-costa-blanca.ru
it-max.com.ua
izhstrelok.ru
jjbabskoe.ru
jobius.com.ua
jumkite.com
justkillingti.me
justprofit.xyz
kabbalah-red-bracelets.com
kabinet-binbank.ru
kabinet-card-5ka.ru
kabinet-click-alfabank.ru
kabinet-lk-megafon.ru
kabinet-login-mts.ru
kabinet-mil.ru
kabinet-mos.ru
kabinet-my-beeline.ru
kabinet-my-pochtabank.ru
kabinet-online-vtb.ru
kabinet-tinkoff.ru
kabinet-ttk.ru
kambasoft.com
kamin-sam.ru
karapuz.org.ua
kazka.ru
kazrent.com
kerch.site
keywords-monitoring-success.com
keywords-monitoring-your-success.com
kharkov.ua
kino-fun.ru
kino-key.info
kino2018.cc
kinobum.org
kinopolet.net
kinosed.net
knigonosha.net
komp-pomosch.ru
komputers-best.ru
komukc.com.ua
konkursov.net
kozhasobak.com
krasnodar-avtolombard.ru
kredytbank.com.ua
laminat.com.ua
landliver.org
landoftracking.com
laptop-4-less.com
law-check-two.xyz
law-enforcement-bot-ff.xyz
law-enforcement-check-three.xyz
law-enforcement-ee.xyz
law-six.xyz
laxdrills.com
leeboyrussia.com
legalrc.biz
lerporn.info
leto-dacha.ru
lipidofobia.com.br
littleberry.ru
livefixer.com
livia-pache.ru
livingroomdecoratingideas.website
lk-gosuslugi.ru
login-tinkoff.ru
loveorganic.ch
lsex.xyz
luckybull.io
lukoilcard.ru
lumb.co
luton-invest.ru
luxup.ru
magicdiet.gq
magnetic-bracelets.ru
makemoneyonline.com
makeprogress.ga
manimpotence.com
manualterap.roleforum.ru
marblestyle.ru
maridan.com.ua
marketland.ml
masterseek.com
matras.space
mattgibson.us
max-apprais.com
maxxximoda.ru
mebel-iz-dereva.kiev.ua
mebelcomplekt.ru
mebeldekor.com.ua
med-dopomoga.com
med-zdorovie.com.ua
medicineseasybuy.com
meds-online24.com
meduza-consult.ru
megapolis-96.ru
metallo-konstruktsii.ru
metallosajding.ru
mifepriston.net
mikozstop.com
mikrocement.com.ua
mikrozaym2you.ru
minegam.com
mirobuvi.com.ua
mirtorrent.net
mksport.ru
mobilemedia.md
mockupui.com
modforwot.ru
modnie-futbolki.net
moinozhki.com
monetizationking.net
money-for-placing-articles.com
money7777.info
moneytop.ru
moneyzzz.ru
mosrif.ru
mostorgnerud.ru
moy-dokument.com
moyakuhnia.ru
muscle-factory.com.ua
musichallaudio.ru
mybuh.kz
myftpupload.com
myplaycity.com
nachalka21.ru
nanochskazki.ru
needtosellmyhousefast.com
net-profits.xyz
nevapotolok.ru
newsrosprom.ru
newstaffadsshop.club
niki-mlt.ru
nizniynovgorod.dienai.ru
novosti-hi-tech.ru
nufaq.com
o-o-11-o-o.com
o-o-6-o-o.com
o-o-6-o-o.ru
o-o-8-o-o.com
o-o-8-o-o.ru
obsessionphrases.com
odiabetikah.com
odsadsmobile.biz
ofermerah.com
office2web.com
officedocuments.net
ogorodnic.com
online-binbank.ru
online-hit.info
online-intim.com
online-mkb.ru
online-templatestore.com
online-vtb.ru
onlinetvseries.me
onlywoman.org
ooo-olni.ru
optsol.ru
orakul.spb.ru
osteochondrosis.ru
ownshop.cf
ozas.net
paidonlinesites.com
palvira.com.ua
pc-services.ru
perm.dienai.ru
perper.ru
petrovka-online.com
photo-clip.ru
photokitchendesign.com
picturesmania.com
pills24h.com
piulatte.cz
pizza-imperia.com
pizza-tycoon.com
pk-pomosch.ru
pk-services.ru
podarkilove.ru
podemnik.pro
podseka1.ru
poiskzakona.ru
pokupaylegko.ru
popads.net
pops.foundation
popugaychiki.com
pornhub-forum.ga
pornhub-forum.uni.me
pornhub-ru.com
porno-chaman.info
pornoelita.info
pornoforadult.com
pornogig.com
pornohd1080.online
pornoklad.ru
pornonik.com
pornoplen.com
portnoff.od.ua
pozdravleniya-c.ru
priceg.com
pricheski-video.com
prlog.ru
procrafts.ru
prodaemdveri.com
producm.ru
prodvigator.ua
professionalsolutions.eu
prointer.net.ua
promoforum.ru
pron.pro
prosmibank.ru
prostitutki-rostova.ru.com
psa48.ru
punch.media
purchasepillsnorx.com
qualitymarketzone.com
quit-smoking.ga
qwesa.ru
rank-checker.online
rankings-analytics.com
ranksonic.info
ranksonic.net
ranksonic.org
rapidgator-porn.ga
rapidsites.pro
razborka-skoda.org.ua
rcb101.ru
realresultslist.com
rednise.com
regionshop.biz
releshop.ru
remkompov.ru
remont-kvartirspb.com
rent2spb.ru
replica-watch.ru
research.ifmo.ru
resell-seo-services.com
resellerclub.com
responsive-test.net
reversing.cc
rfavon.ru
rightenergysolutions.com.au
roof-city.ru
rospromtest.ru
ru-lk-rt.ru
ruinfocomp.ru
rulate.ru
rumamba.com
rupolitshow.ru
rusexy.xyz
ruspoety.ru
russian-postindex.ru
russian-translator.com
rybalka-opt.ru
sad-torg.com.ua
sady-urala.ru
saltspray.ru
sanjosestartups.com
santaren.by
santasgift.ml
santehnovich.ru
savetubevideo.com
savetubevideo.info
scansafe.net
scat.porn
screentoolkit.com
scripted.com
search-error.com
searchencrypt.com
security-corporation.com.ua
sell-fb-group-here.com
semalt.com
semaltmedia.com
seo-2-0.com
seo-platform.com
seo-smm.kz
seoanalyses.com
seocheckupx.com
seocheckupx.net
seoexperimenty.ru
seojokes.net
seopub.net
seoservices2018.com
sexsaoy.com
sexyali.com
sexyteens.hol.es
shagtomsk.ru
share-buttons-for-free.com
share-buttons.xyz
sharebutton.io
sharebutton.net
sharebutton.to
shnyagi.net
shoppingmiracles.co.uk
shops-ru.ru
sibecoprom.ru
sim-dealer.ru
simple-share-buttons.com
sinhronperevod.ru
site-auditor.online
site5.com
siteripz.net
sitevaluation.org
skinali.com
sladkoevideo.com
sledstvie-veli.net
slftsdybbg.ru
slkrm.ru
slomm.ru
slow-website.xyz
smailik.org
smartphonediscount.info
snabs.kz
snegozaderzhatel.ru
snip.to
snip.tw
soaksoak.ru
sochi-3d.ru
social-button.xyz
social-buttons-ii.xyz
social-buttons.com
social-traffic-1.xyz
social-traffic-2.xyz
social-traffic-3.xyz
social-traffic-4.xyz
social-traffic-5.xyz
social-traffic-7.xyz
social-widget.xyz
socialbuttons.xyz
socialseet.ru
socialtrade.biz
sohoindia.net
solitaire-game.ru
solnplast.ru
sosdepotdebilan.com
souvenirua.com
sovetskie-plakaty.ru
soyuzexpedition.ru
sp-laptop.ru
sp-zakupki.ru
spb-plitka.ru
spb-scenar.ru
speedup-my.site
spin2016.cf
sportwizard.ru
spravka130.ru
spravkavspb.net
stavimdveri.ru
steame.ru
stiralkovich.ru
stocktwists.com
store-rx.com
stream-tds.com
stroyka47.ru
studentguide.ru
success-seo.com
sundrugstore.com
superiends.org
supermama.top
supervesti.ru
svetka.info
svetoch.moscow
t-machinery.ru
t-rec.su
taihouse.ru
tattoo-stickers.ru
tattooha.com
td-perimetr.ru
technika-remont.ru
tedxrj.com
tentcomplekt.ru
teplohod-gnezdo.ru
texnika.com.ua
tgtclick.com
thaoduoctoc.com
theautoprofit.ml
theguardlan.com
thesmartsearch.net
tokshow.online
tomck.com
top-gan.ru
top-l2.com
top1-seo-service.com
top10-way.com
topquality.cf
topseoservices.co
track-rankings.online
tracker24-gps.ru
traffic-cash.xyz
traffic2cash.org
traffic2cash.xyz
traffic2money.com
trafficgenius.xyz
trafficmonetize.org
trafficmonetizer.org
traphouselatino.net
trion.od.ua
tsatu.edu.ua
tsc-koleso.ru
tuningdom.ru
twsufa.ru
ua.tc
uasb.ru
ucoz.ru
udav.net
ufa.dienai.ru
ukrainian-poetry.com
ul-potolki.ru
unibus.su
univerfiles.com
unlimitdocs.net
unpredictable.ga
uptime-as.net
uptime-eu.net
uptime-us.net
uptime.com
uptimechecker.com
uzpaket.com
uzungil.com
vaderenergy.ru
validus.pro
varikozdok.ru
veloland.in.ua
ventopt.by
veselokloun.ru
vesnatehno.com
viagra-soft.ru
video--production.com
video-woman.com
videos-for-your-business.com
viel.su
viktoria-center.ru
vodaodessa.com
vodkoved.ru
vzheludke.com
vzubkah.com
w3javascript.com
wallpaperdesk.info
wdss.com.ua
we-ping-for-youic.info
web-revenue.xyz
webmaster-traffic.com
webmonetizer.net
website-analytics.online
website-analyzer.info
website-speed-check.site
website-speed-checker.site
websites-reviews.com
websocial.me
weburlopener.com
wmasterlead.com
woman-orgasm.ru
wordpress-crew.net
wordpresscore.com
workius.ru
works.if.ua
worldmed.info
wufak.com
ww2awards.info
www-lk-rt.ru
x5market.ru
xkaz.org
xn-------53dbcapga5atlplfdm6ag1ab1bvehl0b7toa0k.xn--p1ai
xn-----6kcamwewcd9bayelq.xn--p1ai
xn-----7kcaaxchbbmgncr7chzy0k0hk.xn--p1ai
xn-----clckdac3bsfgdft3aebjp5etek.xn--p1ai
xn----7sbabhjc3ccc5aggbzfmfi.xn--p1ai
xn----7sbabm1ahc4b2aqff.su
xn----7sbabn5abjehfwi8bj.xn--p1ai
xn----7sbbpe3afguye.xn--p1ai
xn----7sbho2agebbhlivy.xn--p1ai
xn----8sbaki4azawu5b.xn--p1ai
xn----8sbarihbihxpxqgaf0g1e.xn--80adxhks
xn----8sbhefaln6acifdaon5c6f4axh.xn--p1ai
xn----8sblgmbj1a1bk8l.xn----161-4vemb6cjl7anbaea3afninj.xn--p1ai
xn----ctbbcjd3dbsehgi.xn--p1ai
xn----ctbfcdjl8baejhfb1oh.xn--p1ai
xn----ctbigni3aj4h.xn--p1ai
xn----ftbeoaiyg1ak1cb7d.xn--p1ai
xn----itbbudqejbfpg3l.com
xn--80aaajkrncdlqdh6ane8t.xn--p1ai
xn--80aanaardaperhcem4a6i.com
xn--80adaggc5bdhlfamsfdij4p7b.xn--p1ai
xn--80adgcaax6acohn6r.xn--p1ai
xn--90acenikpebbdd4f6d.xn--p1ai
xn--90acjmaltae3acm.xn--p1acf
xn--c1acygb.xn--p1ai
xn--d1abj0abs9d.in.ua
xn--d1aifoe0a9a.top
xn--e1agf4c.xn--80adxhks
xz618.com
yaderenergy.ru
yhirurga.ru
ykecwqlixx.ru
yodse.io
youporn-forum.ga
youporn-forum.uni.me
youporn-ru.com
yourserverisdown.com
zahvat.ru
zastroyka.org
zavod-gm.ru
zdm-auto.com
zdorovie-nogi.info
zelena-mriya.com.ua
zoominfo.com
zvetki.ru


@@ -0,0 +1,178 @@
This is a community-contributed list of search engine and social network definitions, maintained and used by [Matomo](https://matomo.org/) (formerly Piwik), the leading open source web analytics platform.
# Social Networks
Social networks are defined in YAML format in the file `Socials.yml`.
Each definition contains the name of the social network and a list of one or more URLs.
```YAML
"My Social Network":
- my-social-network.com
- mysocial.org
```
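A lookup against such a definition could be sketched as below. The example is Python and assumes the YAML has already been loaded into a dictionary that maps each name to its list of URLs. `detect_social_network` is a hypothetical helper, and treating subdomains as matches is an assumption rather than documented behavior.

```python
def detect_social_network(host, socials):
    """Return the name of the social network that owns `host`, or None."""
    host = host.lower()
    for name, urls in socials.items():
        for url in urls:
            # Assumption: a subdomain of a listed URL also counts as a match.
            if host == url or host.endswith("." + url):
                return name
    return None


# Same structure as the YAML example above, written as a Python dict.
SOCIALS = {"My Social Network": ["my-social-network.com", "mysocial.org"]}
```

Calling `detect_social_network("www.mysocial.org", SOCIALS)` returns `"My Social Network"`, while an unrelated host returns `None`.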
# Search Engines
Search engines are defined in YAML format in the file `SearchEngines.yml`.
A search engine definition contains several parameters that are required to detect which search engine and which search keywords are included in a given URL.
Those parameters are:
- name of the engine
- URLs of the engine
- request parameters (or regexes), that can be used to get the search keyword
- hidden keyword paths
- backlink pattern, that can be used to create a valid link back to the search engine (with the keyword)
- charsets that might be used to convert keyword to UTF-8
For each search engine (name) it is possible to define multiple configurations.
Each configuration needs to include one or more urls, one or more parameters/regexes and may include a backlink and one or more charsets.
## Configuration parameters
### urls
Each configuration needs to contain one or more URLs. Please only define the hostname.
You can use `{}` as a placeholder for country shortcodes in subdomains or the TLD.
- `{}.searchengine.com` would also match `de.searchengine.com` or `nl.searchengine.com`
- `searchengine.{}` would also match `searchengine.de` or `searchengine.nl`
#### Notes:
- For TLDs only, `{}` also matches combined TLDs such as `co.uk` (full list: `com.*`, `org.*`, `net.*`, `co.*`, `it.*`, `edu.*`).
- The first URL will be used for the icon, so the most popular/representative URL should be placed there
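The placeholder rules above can be sketched as a conversion to a regular expression. This Python example is an illustration under stated assumptions, not Matomo's implementation: `url_pattern_to_regex` is a hypothetical helper, and a country shortcode is assumed to be exactly two lowercase letters.

```python
import re

# Prefixes of the combined TLDs listed in the notes above (com.*, org.*, ...).
COMBINED_PREFIXES = ("com", "org", "net", "co", "it", "edu")


def url_pattern_to_regex(pattern):
    """Compile a URL definition containing `{}` into a host-matching regex."""
    country = r"[a-z]{2}"  # assumption: two-letter country shortcodes
    if pattern.endswith(".{}"):
        # In TLD position, also accept combined TLDs such as co.uk.
        combined = "|".join(COMBINED_PREFIXES)
        replacement = rf"(?:(?:{combined})\.)?{country}"
    else:
        replacement = country
    escaped = re.escape(pattern).replace(re.escape("{}"), replacement)
    return re.compile(rf"^{escaped}$")
```

Under these assumptions, `{}.searchengine.com` matches `de.searchengine.com`, and `searchengine.{}` matches both `searchengine.nl` and `searchengine.co.uk`.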
### params
Each configuration needs to contain one or more params. A param is the name of a request parameter that might be present in the URL.
As many search engines do not pass keywords in query parameters but include them in the URL structure, it is also possible to define a regex.
A regex needs to be enclosed in `/`.
```YAML
SearchEngine:
  -
    urls:
      - searchengine.com
    params:
      - q
      - '/search\/[^\/]+\/(.*)/'
```
The example above would first try to get the keyword from the request param `q`. If that is not available, it would use the regex `'/search\/[^\/]+\/(.*)/'` to get it.
This regex would match a URL like `http://searchengine.com/search/web/matomo`.
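The lookup order described above can be sketched in Python. This is a minimal illustration, not Matomo's code: `extract_keyword` is a hypothetical helper, and applying the regex to the URL path (rather than the full URL) is an assumption.

```python
import re
from urllib.parse import parse_qs, urlparse


def extract_keyword(referrer_url, params):
    """Try each param in order and return the first keyword found, or None."""
    parsed = urlparse(referrer_url)
    query = parse_qs(parsed.query)
    for param in params:
        if len(param) > 2 and param.startswith("/") and param.endswith("/"):
            # Entries enclosed in '/' are regexes; group 1 holds the keyword.
            match = re.search(param[1:-1], parsed.path)
            if match and match.group(1):
                return match.group(1)
        elif param in query:
            # Plain entries are names of request parameters.
            return query[param][0]
    return None


PARAMS = ["q", r"/search\/[^\/]+\/(.*)/"]
```

With `PARAMS` as defined, `http://searchengine.com/?q=piwik` resolves through the `q` parameter, and `http://searchengine.com/search/web/matomo` falls through to the regex.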
### backlink
A backlink is used to generate a link back to the search engine that includes the given keyword. Backlinks may be defined per configuration and need to include `{k}` as a placeholder for the keyword.
```YAML
SearchEngine:
  -
    urls:
      - searchengine.com
    params:
      - q
    backlink: '/search?q={k}'
```
For the configuration above the generated backlink would look like `searchengine.com/search?q=matomo` (assuming that `matomo` is the keyword).
#### Note:
The backlink will always be generated using the __first__ defined url in this configuration block.
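Backlink generation can be sketched as a simple substitution. The Python helper below is hypothetical (`build_backlink` is not part of this repository), and URL-encoding the keyword is an assumption that the text above does not specify.

```python
from urllib.parse import quote_plus


def build_backlink(urls, backlink, keyword):
    """Combine the first URL of a configuration with its backlink pattern."""
    # The first defined URL is always used, as noted above.
    # Assumption: the keyword is URL-encoded before it replaces {k}.
    return urls[0] + backlink.replace("{k}", quote_plus(keyword))
```

For the configuration above, `build_backlink(["searchengine.com"], "/search?q={k}", "matomo")` produces `searchengine.com/search?q=matomo`.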
### hiddenkeyword
More and more search engines hide keywords in referrers for privacy reasons. `hiddenkeyword` defines the paths a search engine refers from that may not provide a keyword.
If a search engine always refers from the path `/do/search`, that path should be added. If the path might vary, regexes can be added as strings starting and ending with `/`, e.g. `/\/search[0-9]*/`.
NOTE: The path matched against also includes the referrer's query string and hash. So if the referrer might contain a query, you might use a regex like `/search(\?.*)?/`.
```YAML
SearchEngine:
  -
    urls:
      - searchengine.com
    params: []
    hiddenkeyword:
      - '/^$/'
      - '/'
      - '/search'
```
The configuration above would allow an empty keyword for `searchengine.com`, `searchengine.com/` and `searchengine.com/search`.
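The path check can be sketched in Python as follows. This is an illustration under assumptions, not Matomo's implementation: `allows_hidden_keyword` is a hypothetical helper, and the rule used here to tell a regex from a literal path (longer than two characters, starting and ending with `/`) is a guess made for this example.

```python
import re


def allows_hidden_keyword(path, hidden_patterns):
    """Return True if a referrer path may legitimately carry no keyword.

    `path` is the referrer path including query string and hash.
    """
    for pattern in hidden_patterns:
        if len(pattern) > 2 and pattern.startswith("/") and pattern.endswith("/"):
            # Assumption: entries enclosed in '/' are regexes.
            if re.search(pattern[1:-1], path):
                return True
        elif pattern == path:
            # Everything else is compared as a literal path.
            return True
    return False


HIDDEN = ["/^$/", "/", "/search"]
```

With `HIDDEN` as defined, the empty path, `/` and `/search` are accepted, while `/search?q=x` is not, because the literal entry does not cover a query string.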
### charsets
Charsets can be defined if search engines are using charsets other than UTF-8. The provided charset will be used to convert any detected search keyword to UTF-8.
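The conversion could look like the Python sketch below. It is only an illustration: `decode_keyword` is a hypothetical helper, and trying UTF-8 first before falling back to the configured charsets is an assumption about the order of attempts.

```python
from urllib.parse import unquote_to_bytes


def decode_keyword(raw_keyword, charsets):
    """Decode a percent-encoded keyword into a Python (Unicode) string."""
    data = unquote_to_bytes(raw_keyword)
    # Assumption: UTF-8 is tried first, then the configured charsets in order.
    for charset in ["utf-8", *charsets]:
        try:
            return data.decode(charset)
        except (UnicodeDecodeError, LookupError):
            continue
    # Last resort: keep what can be decoded and mark the rest as replaced.
    return data.decode("utf-8", errors="replace")
```

A keyword sent as `windows-1250` bytes is not valid UTF-8 in most cases, so the fallback charset from the configuration is what recovers the original text.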
## Simple definition
A simple definition of a search engine might look like this:
```YAML
SearchEngine:
  -
    urls:
      - searchengine.com
      - search-engine.org
    params:
      - q
      - as_q
```
The example above would match for the hosts `searchengine.com` and `search-engine.org` and use the request parameters `q` and `as_q` (in this order) to detect the search keyword.
## Multiple configurations
A simple definition of a search engine with multiple configurations might look like this:
```YAML
SearchEngine:
  -
    urls:
      - searchengine.com
    params:
      - as_q
  -
    urls:
      - search-engine.org
    params:
      - q
```
The definition above would again match the hosts `searchengine.com` and `search-engine.org`. However, the request parameter `q` will be used only for `search-engine.org`, while the request parameter `as_q` will be used only for `searchengine.com`.
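Selecting the right configuration for a host can be sketched as below. The Python example assumes the YAML has been loaded into a dictionary of lists. `find_configuration` is a hypothetical helper, and it compares hosts literally without expanding `{}` placeholders.

```python
def find_configuration(host, engines):
    """Return (engine name, configuration) for the first matching host."""
    for name, configurations in engines.items():
        for configuration in configurations:
            # Simplification: exact host comparison, no {} expansion.
            if host in configuration["urls"]:
                return name, configuration
    return None


# Same structure as the YAML example above, written as a Python dict.
ENGINES = {
    "SearchEngine": [
        {"urls": ["searchengine.com"], "params": ["as_q"]},
        {"urls": ["search-engine.org"], "params": ["q"]},
    ]
}
```

Looking up `search-engine.org` returns the second configuration, so only `q` is considered for that host.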
## Complete definition
A complete definition (including all optionals) of a search engine might look like this:
```YAML
SearchEngine:
  -
    urls:
      - searchengine.com
    params:
      - q
    backlink: '/search?q={k}'
    charsets:
      - windows-1250
  -
    urls:
      - search-engine.org
    params:
      - as_q
  -
    urls:
      - search-engine.fr
    params: []
    hiddenkeyword:
      - '/'
      - '/^search.*/'
```
In this case, a backlink and a charset are defined only for the first configuration, so neither a backlink nor a charset is set for `search-engine.org`.
# Contribute
We welcome your contributions and pull requests at [github.com/matomo-org/searchengine-and-social-list](https://github.com/matomo-org/searchengine-and-social-list/edit/master/README.md)!

File diff suppressed because it is too large


@@ -0,0 +1,256 @@
Badoo:
- badoo.com
Bebo:
- bebo.com
BlackPlanet:
- blackplanet.com
Buzznet:
- buzznet.com
Classmates.com:
- classmates.com
Cyworld:
- global.cyworld.com
Gaia Online:
- gaiaonline.com
Geni.com:
- geni.com
GitHub:
- github.com
Google%2B:
- plus.google.com
- url.google.com
- com.google.android.apps.plus
Douban:
- douban.com
Dribbble:
- dribbble.com
Facebook:
- facebook.com
- fb.me
- m.facebook.com
- l.facebook.com
Fetlife:
- fetlife.com
Flickr:
- flickr.com
Flixster:
- flixster.com
Fotolog:
- fotolog.com
Foursquare:
- foursquare.com
"Friends Reunited":
- friendsreunited.com
Friendster:
- friendster.com
gree:
- gree.jp
Haboo:
- habbo.com
"Hacker News":
- news.ycombinator.com
hi5:
- hi5.com
Hyves:
- hyves.nl
identi.ca:
- identi.ca
instagram:
- instagram.com
- l.instagram.com
lang-8:
- lang-8.com
Last.fm:
- last.fm
- lastfm.ru
- lastfm.de
- lastfm.es
- lastfm.fr
- lastfm.it
- lastfm.jp
- lastfm.pl
- lastfm.com.br
- lastfm.se
- lastfm.com.tr
LinkedIn:
- linkedin.com
- lnkd.in
- linkedin.android
LiveJournal:
- livejournal.ru
- livejournal.com
MeinVZ:
- meinvz.net
Mixi:
- mixi.jp
MoiKrug.ru:
- moikrug.ru
Multiply:
- multiply.com
my.mail.ru:
- my.mail.ru
MyHeritage:
- myheritage.com
MyLife:
- mylife.ru
Myspace:
- myspace.com
myYearbook:
- myyearbook.com
Nasza-klasa.pl:
- nk.pl
Netlog:
- netlog.com
Odnoklassniki:
- odnoklassniki.ru
Orkut:
- orkut.com
Ozone:
- qzone.qq.com
Pinterest:
- pinterest.com
- pinterest.ca
- pinterest.ch
- pinterest.co.uk
- pinterest.de
- pinterest.dk
- pinterest.es
- pinterest.fr
- pinterest.ie
- pinterest.jp
- pinterest.nz
- pinterest.pt
- pinterest.se
Plaxo:
- plaxo.com
reddit:
- reddit.com
- np.reddit.com
- pay.reddit.com
Renren:
- renren.com
Skyrock:
- skyrock.com
Sonico.com:
- sonico.com
StackOverflow:
- stackoverflow.com
StudiVZ:
- studivz.net
Tagged:
- login.tagged.com
Taringa!:
- taringa.net
Telegram:
- web.telegram.org
- org.telegram.messenger
Tuenti:
- tuenti.com
tumblr:
- tumblr.com
Twitter:
- twitter.com
- t.co
Sourceforge:
- sourceforge.net
StumbleUpon:
- stumbleupon.com
Vkontakte:
- vk.com
- vkontakte.ru
YouTube:
- youtube.com
- youtu.be
V2EX:
- v2ex.com
Viadeo:
- viadeo.com
Vimeo:
- vimeo.com
vkrugudruzei.ru:
- vkrugudruzei.ru
WAYN:
- wayn.com
Weibo:
- weibo.com
- t.cn
WeeWorld:
- weeworld.com
"Windows Live Spaces":
- login.live.com
Xanga:
- xanga.com
XING:
- xing.com


@@ -0,0 +1,8 @@
{
"name": "matomo/searchengine-and-social-list",
"description": "Search engine and social network definitions used by Matomo (formerly Piwik)",
"license": "CC0-1.0",
"replace": {
"piwik/searchengine-and-social-list":"*"
}
}


@@ -0,0 +1,20 @@
{
"name": "searchengine-and-social-list",
"version": "1.4.1",
"description": "Search engine and social network definitions used by Matomo (formerly Piwik)",
"keywords": [
"searchengine",
"social network"
],
"main": "SearchEngines.yml",
"repository": {
"type": "git",
"url": "git://github.com/matomo-org/searchengine-and-social-list.git"
},
"homepage": "https://github.com/matomo-org/searchengine-and-social-list#readme",
"bugs": {
"url": "https://github.com/matomo-org/searchengine-and-social-list/issues"
},
"author": "Matomo Team",
"license": "CC0-1.0"
}