Your comment seems to assume 1) that employment levels for software engineers are the same in India as where you are, 2) that the OP has overstated the social stigma of layoffs in India, and 3) that the current employment landscape you find yourself in will last indefinitely.
Based on this, I suspect it's highly likely the hubris in this comment will not age well.
1) India is the chief exporter of software engineering to the world. There are over a billion people in India. So naturally, on a per-population basis the employment numbers will be lower. Given that there are nearly entire towns dedicated to tech services, I doubt finding a new job is difficult for those that want one, especially given how huge labor arbitrage is in the west right now.
2) No idea. My Indian contractor coworkers never seemed to care. It's the H1Bs that panic during a layoff (understandably).
3) I don't think it will, that's why I get mine while the gettin' is good and hopefully have enough to not worry about the future whenever the FAANGs finally succeed at driving wages into the ground. However, given there are nearly 4 jobs for every engineer currently in the field according to the BLS I'm not worried at all about the next decade. I'll just continue to stack cash and re-evaluate every once in a while.
Most startups doing these layoffs are not really solving any real problems. Many are wrapping existing solutions in a better UI/UX, which might be nice to have in an upcycle but will be the first thing on the chopping block in a downturn.
My guess is because it was self-submitted, it was held to a higher standard. Fair enough.
And in some universes the title could be considered click bait, although it is accurate in this case.
Moderation is a tough job. You never win.
That said, the revised HN title seems like it was written by bad AI. The point seemed to be to drive it off the homepage. In that, the HN title succeeded.
Regardless, I'm happy the article generated a lot of interesting discussion before manually being deemed unfit.
It is curious how, at the same time the title changed, all the top comments (which were generally supportive) got pushed to the bottom. And now yours as well. I assume this article touched a nerve at the same time someone was having a bad day.
Author here. Frustrating situation. As the title is long at 84 characters, we know that Google is definitely going to rewrite it. The simplest way is to break it into parts and get rid of the shortest part that still makes sense.
So maybe take
'Towards Platform Democracy: Policymaking Beyond Corporate CEOs and Partisan Pressure'
And 1) condense it and 2) lose the colon
'Platform Democracy is Policymaking Beyond CEOs & Partisanship' (60 characters)
If that is too condensed, you could try a short title in the <title> and a longer title in the copy.
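A rough sketch of what that split could look like (hypothetical markup, not taken from the actual page):

    <head>
      <!-- short, condensed title used for the tab and (usually) the search result -->
      <title>Platform Democracy is Policymaking Beyond CEOs &amp; Partisanship</title>
    </head>
    <body>
      <!-- fuller title kept in the visible copy -->
      <h1>Towards Platform Democracy: Policymaking Beyond Corporate CEOs and Partisan Pressure</h1>
    </body>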
So there's an interesting distinction required here! When Google says they use the title 80% of the time, they mean they use the title 80% of the time to create their search result title, which they may or may not modify. The other 20% of the time they use an H1 or other elements on the page.
And I certainly changed my diet over the past 10 years, and my gut bacteria have become more diverse and presumably healthier as a result. I've verified this through various stool tests and 3rd-party microbiome trackers over the years (uBiome, Thryve).
I don't want to get into pseudoscience or offer anyone advice, but I can tell you that increasing specific bacteria that produce butyric acid within the gut has seemed to work for me. In theory, one could do this by eating butyrate-promoting foods such as almonds, apples, barley, kiwifruit, and more, or by taking a bifidobacteria-enhancing fiber such as GOS.
I think what folks are missing is that a lot of these "zero-click" searches happen as a result of Google scraping your website, and displaying the results as a "featured snippet."
Yes, they link to you below the featured snippet.
No, more people don't click, because they've taken the answer from your website and displayed it right in their search results.
For example: If I'm searching for "best nail for cedar wood" Google gives me the answer: STAINLESS STEEL - and I never had to click through to the website that gave the answer: https://bit.ly/2MdovdP
• Yes, this is good for users (it would also be good for users if Netflix gave away movies free)
• Overall, the publishers who "rank" for this query receive fewer clicks
• Google earns more ad revenue as users stick around on Google longer
Ironically, Google has a policy against scraping their results, but their whole business model is predicated on scraping other sites and making money off the content - in many cases never sending traffic (or sending significantly reduced traffic) to the publisher of the content.
> No, more people don't click, because they've taken the answer from your website and displayed it right in their search results.
It's for this reason that I've stopped embedding micro data in the HTML I write.
Micro data only serves Google. Not my clients. Not my sites. Just Google.
Every month or so I get an e-mail from a Google bot warning me that my site's micro data is incomplete. Tough. If Google wants to use my content, then Google can pay me.
If Google wants to go back to being a search engine instead of a content thief and aggregator, then I'm on board.
I just got one of those emails for the first time about my personal site that's basically my resume. Apparently my text is small on mobile (it's not...) and some other crap
I don't get why google thinks it's acceptable to critique my site without prompting. It honestly just feels rude. They want me to do a whole bunch of micro-optimizations on a site that already works fine because it doesn't fit their standard of "high quality". I think I've gotten exactly 0 clicks from Google search results ever and I don't really ever want any.
If it were possible to get a human's attention at Google I'd start sending my own criticism their way but of course it doesn't work like that...
I was curious what it was complaining about, since https://henryfjordan.com looks great to me. I tried to run it through Google's "Mobile Friendly Test" but fetching failed [1] because your robots.txt has:
    User-agent: *
    Disallow: /
This would explain why you've gotten zero clicks from Google (or I would guess anyone else's) search results!
On the other hand, it's surprising that you would get a notification if you had crawling disabled. Did you set this robots.txt up recently?
Google seems to see robots.txt as "more what you call guidelines, than actual rules". Sites that block googlebot or all bots with robots.txt still turn up in google searches, just without a description, and are obviously still indexed.
robots.txt is a tool to control crawling, not to specify how you would like your site to be displayed (or not) in search results. If you don't want search engines to include your site, set:
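    <meta name="robots" content="noindex">

(That's the standard robots noindex meta tag; the equivalent X-Robots-Tag HTTP header works as well.)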
The blog post you link has a bunch of alternatives, but I agree they're not great. If there are a lot of webmasters who want to be able to noindex through robots.txt then making the case for adding noindex to the standard would be a good next step.
I sent you an email, and I'm posting it here but without identifying info:
---
Hi Jeff,
Thank you for your comment. I'm replying via email to send some info I'd rather not share on HN, but will post the same, redacted, on HN. Back when starting my web-dev career I was the one-man development team of a web agency, and all our development/pre-prod sites (that had to be unauthed) had robots.txt files disallowing all bots, but they still popped up in Google. Searching some of the old domains in Google I found an example here: http://***.***/***, and attached is an example of it showing up in a SERP and what the robots.txt looks like (and I'm pretty sure that the robots.txt has looked like that since that page was created).
In this case it is just one page that nobody will care about, and since I'm not working on projects that are open but "robots.txt hidden" anymore I don't know if it is as bad as it used to be, but I regularly see pages with the "No information is available for this page" message whose domains have robots.txt files that disallow all bots but still show up in Google.
Thanks for sending the screenshot! That site shows up with "no information is available for this page", which means that while robots.txt has disallowed bots from crawling it the page is still linked from other pages that do allow crawling.
The robots.txt protocol gives instructions to crawlers about how they should interact with the site. If you instead want to give instructions to indexes, use the noindex meta tag.
You're right, I was wrong about how to expect a "Disallow: /" to work. But isn't it sorta odd to have a protocol to control crawling (which is usually done to index) but (almost) require a compliant indexer to crawl all pages to comply with the indexing rules?
In this example the robots.txt has clearly told all bots to not crawl this site, but the only way to read the meta tag (or equivalent header) is to crawl the site. So I assume that in this case Google either assumes that it is fine to crawl URLs that it has found elsewhere while ignoring the robots.txt, or it assumes that pages disallowed by robots.txt are "open for indexing/linking", which would mean that any page both disallowed by robots.txt and which has a noindex meta tag would still show up, right?
What is the intended behavior if a page is disallowed by robots.txt and still linked by another indexed page? Will it get crawled or just assumed to be okay for indexing/linking? Is there any way to tell Google not to index/link and not to crawl?
If you have a calendar where every month links to the previous and next months, a crawler can get stuck and hammer the server. That's the kind of thing robots.txt is for.
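A minimal robots.txt sketch of that kind of rule (hypothetical path):

    User-agent: *
    Disallow: /calendar/

Crawlers that honor it skip the endless calendar pages but can still crawl the rest of the site.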
>"more what you call guidelines, than actual rules"
They can index without scraping. It is enough that other websites have links to your site. So the Google bot follows the rules in robots.txt to the letter. "no-index" is the way to stay away from Google.
They can't read my no-index if they obey my robots.txt. Do they break the robots.txt to be able to read my no-index or do they assume my "Disallow: /" means I'm fine with them indexing/linking?
Without the noindex part of robots.txt (which google decided to ignore not so long ago) this is not solvable.
Oh, I just added that yesterday as a response to the email. Before that I was actually running Google Analytics but since I get basically 0 clicks it wasn't really useful.
I have a feeling the PDF viewer triggered it, because on mobile it defaults to showing the whole page, which results in tiny text. But that's easily fixed by the user, so I prefer to leave it like that.
Yeah, it's amazing how rapidly and rabidly they show up when the complaint is about one of their paid features, like a Google Cloud (GCE) post about them or a competitor, but nada on the other products. Well, no, it's actually not surprising.
> If Google wants to go back to being a search engine
While I understand the problems with Google scraping content, as a user these snippets help me find what I'm searching for faster. If that's all you're optimizing for, Google is fantastic. There are certainly good arguments to be made for other models, but for search, stealing content helps. I'm not advocating stealing content, I'm just saying that it produces more useful results.
How do you know that the content Google features is the best there is? If we stop clicking on sites and just rely on Google to provide us the content we'll go down a very slippery slope.
I don't really see how this problem is any different from 'how do we know the #1 search result is the best content there is?' If it provides the information you want, then great; otherwise you load #2.
Google lends the weight of its authority to the answers it presents. It's one thing if Infowars says that Obama is planning a coup against Donald Trump, it's another if Google says so.
The first three results lead you to fake Android blogs telling you how you can easily root every Chinese Android device, and specifically the M89 tablet...
The real authoritative result (xda-developers) only appears in the fourth position, below the fold. It will tell you that if you follow the instructions given in the fake blog posts from the first two or three results, you will brick your tablet.
In a similar way the word "cbd" (for cannabidiol) has been hijacked by dubious commercial companies through fake blog posts filling page after page of Google results, telling you how great CBD is for the treatment of every disease on earth... But there is no trace of an actual study in these results. You will have to go with the less popular word "cannabidiol" to start to see some serious articles about it.
Google results can be hijacked and Google does little about it. Maybe because the ads shown in these fake blog posts are from the Google ads network? I don't know...
But Google results have clearly deteriorated these last few years, and the company's authority is no longer what it was.
I know that sort of thing happens sometimes (Google presenting a spurious statement as a categorical answer) but those are bugs. As long as they are very rare, and fixed quickly when they occur, I don’t see them causing much harm.
OK, some people believe anything they read (especially if it confirms their existing biases), but that problem has always existed. I think Google’s occasional snippet fuck-ups are a drop in the ocean compared to the spread of false information through social networks.
There's the modern news-cycle axis, where Google can and should devote full-time engineers.
But the long tail is important too. It's fixed now (yay) but for years you could search for "calories in corn" and Google would confidently present an answer 5x the true value, scraped from a site with profoundly wrong information. As Google moves to present more direct answers and fewer links, this risk increases.
It looks like they have backed off on the direct answers somewhat which is good news.
Very few new blogs and content websites are being set up.
All content is moving into apps and walled gardens. Part of the reason for that is that running a well-researched blog will never pay for your time, so it becomes a hobby thing, and most people are fine using Facebook for that.
> Micro data only serves Google. Not my clients. Not my sites. Just Google.
Well it also serves Google's users, to be clear. Though I should also be clear that I don't think that justifies it, since I think it's bad for the ecosystem in more subtle ways than are expressed in immediate user satisfaction.
That depends on how you define "users". If you define a website creator also as a Google user (by virtue of wanting to be found through Google), then Google is serving part of its users to the detriment of their other users.
And if you view Google instead as a connection broker, i.e. a middleman between publisher and consumer, then Google is destroying its own business by snubbing publishers. Assuming that Google is still making rational, intelligent decisions, it follows that Google no longer sees itself like that.
Did Google ever see itself as prioritizing publishers and consumers equally? I think that’s a false premise and the parent is right; Google’s priority has always been consumer first.
Google (and virtually every other search engine) has always included content with links. What's different now (but not unique to Google, though they are perhaps the most advanced at it) is that it now algorithmically synthesizes content instead of merely aggregating it.
On top of all that, Google's snippets aren't curated and therefore aren't always correct. They can be (and almost certainly are) gamed. Users that don't click through open themselves up to remaining misinformed.
I've found them to be incorrect so often when I clicked through to the actual page or found a better link. I don't trust just the blurb for any answers any more.
> I don't trust just the blurb for any answers any more.
I don't, either.
A site I used to own had a discussion forum on it. It contained a message along the lines of "Real Estate Agent X is a great guy. Real Estate Agent Y is a complete sleazebag."
The blurb that Google displayed for it was "Real Estate Agent X... is a sleazebag." And that was the first result for anyone who searched for that agent's name.
As you can imagine, I received many angry e-mails, phone calls, and legal threats. No, you can't explain to angry people that it's "just" an algorithm that told the world that they're a sleazebag.
I ended up editing the post so that Google would display a different version after its next scrape.
I think there's more to this... Google uses lots of fancy Natural Language Processing stuff to extract that data, and unless the wording was very tortuous, I doubt it could make such a big mistake by chance.
They can get it painfully wrong. I came down with something like optic neuritis a few years ago. It's often one of the first signs of MS in many folks. When I googled something like "MS life expectancy", the blurb said something like "3-7 years" -- with the subtext indicating it's 3-7 years LESS than average, rather than "you're kicking it in 3 years".
I think they’re believable because Google started by providing things that weren’t wrong. If you search for a time zone, Google shows it in your local time; if you search for a currency conversion, Google does that too. It has done all those things for ages, and they were typically correct.
Then the snippets show up, and they are presented in a similarly trustworthy fashion. But the snippets are really just the result of whichever site has the best SEO, and that’s often a really worthless metric these days. The time zone and currency stuff is easy, because it’s math, but opinions aren’t. The thing is, though, that even if Google didn’t have the snippets, those sites that get snippets would still be the top results that we clicked, and we’d still get the wrong information. That would probably be better, because it might be easier to spot obvious bad sources, but I still think there is just a fundamental flaw in how SEO professionals have learned to game the Google bot to bring the world useless information.
I mean, part of it is certainly on google. No one in their right mind wants to comply with Google’s ranking terms, unless you make money from google searches. Which means a lot of useful personal blogs have dropped off the face of the internet, unless you’re really lucky to see them linked on a place like HN.
I wish libraries would band together and make a privacy focused and curated search engine, because librarians are actually kind of good at finding you the correct information.
It sucks. Sometimes the bold text is the exact opposite of the answer to the query I search for. It’s very misleading unless you click through and read the full context.
This is especially true where the answer is time-bound, which happens a lot in technical topics. Many times the snippet is for an earlier version of the language (but still with a high PageRank), or the operating system (especially Android settings), and the most annoying of all: an ancient answer in an undated blog post.
Google is good at dating undated content. They keep track of the first time they've ever seen a bit of text, and assume it was composed then, even if it later gets copied to other sites.
The point is that Google frequently adds another level of incorrectness, that may not be identifiable without checking the source. This is pretty common on Wikipedia, and when people link to things in discussion forums, as well.
And anything Google does is done at vast scale, which makes me, at least, think it might be substantially affecting society.
But that's the responsibility of that website. Of course it's bad if Google lists a site with wrong information as the first hit, but I think it's worse when Google blindly copies that false info and lists it as their own zero-click result. By doing that, Google itself takes responsibility for the information.
Although sometimes the site is actually correct and Google still gets it wrong by copying the info incorrectly or losing some context or qualifiers.
I loved zero-click results back when DuckDuckGo first introduced them, but I'm less enthusiastic about Google's implementation of them.
It's important to note that this is strategically incredibly important for Google because it forms the backbone of their voice AI. The better they get at answering questions directly, the better their voice AI becomes, and that leads to a lot of future products.
AdWords is and always has been the goose that lays the golden eggs, none of Google's other initiatives have ever rivaled that revenue. That's why they put so much effort into bolstering and optimizing their search results pages.
Another reason is the use of add-ons such as: "Google search link fix - Prevents Google and Yandex search pages from modifying search result links when you click them."
I stopped using Google a few years ago, but just in case I keep this (or a similar) add-on in my Firefox.
I have no idea of the popularity of such addons, but they would also impact the tracking that Google does.
It's been this way for ages, although for chrome (iirc) this is managed via hyperlink auditing [1] which allows google to track what you're clicking even though the link appears 'clean'.
The click-through Google redirect also allows them to track things like relevancy of the content and time on site (if you return to the Google SERP by clicking the back button), in case the target site isn't using Google Analytics (unfortunately most sites do).
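For what it's worth, hyperlink auditing is just the HTML ping attribute; a tracked-but-clean-looking link is roughly this (hypothetical URLs):

    <a href="https://example.com/some-result" ping="https://tracker.example/click">Some result</a>

The href stays untouched, and the browser sends a background request to the ping URL when the link is clicked.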
Any search engine is going to want to know what people click on so they can make their product better. For example, I just searched for [test] on DuckDuckGo and when clicking on the first result I see DDG sending a ping back:
https://improving.duckduckgo.com/t/lc?...
which contains which URL I clicked.
(Disclosure: I work for Google, speaking only for myself)
Startpage is an anonymizing proxy for Google Search, not a full search engine. Crucially, it doesn't determine how to rank results. If they decided to try to compete with Google, Bing, Yandex, DDG etc directly by bringing ranking in-house they would have a very hard time serving good results without being able to track which of their links were popular among users.
How safe are all these plugins we install to escape tracking? Are we trying to escape big tech tracking only to hand our information over to extension developers? Looking at network traffic often shows a ton of extensions sending data to some aws server almost perpetually.
Asking because I'm not sure of the answer to this question and lately I've become even warier so I decided to uninstall everything except things I absolutely must have like colorzilla, grammarly and full-page screen capture. For adblocking I use brave and never ever touch firefox, opera or chrome.
There's an extension that appends a share=1 parameter to all Quora links to prevent them from forcing you to sign in in order to view a post. I like it, but I'm trying to minimize my extension footprint and I'd rather write my own script to do the same thing.
The question is, how do you get to be sure that an extension is safe?
> Google has a policy against scraping their results, but their whole business model is predicated off scraping other sites and making money off the content
Yea a couple days ago I was checking the Places API, which they’ve built off user-generated content and scraping Yelp and others. They charge $17 / 1000 calls for certain items and don’t you dare cache anything for too long.
Great way to build a business: get data for free, wall it off and put a hefty price tag on it, then put your best lawyers around the moat for good measure!
I downloaded all the places data for the world while it was still free. In my jurisdiction, the data is considered owned by the place owners rather than Google, so I doubt they'll come after me.
I disagree. There is an implicit contract between website publishers and search engines that it’s ok to do this. The website can set nosnippet in robots if they want to not have the snippet in search results.
You put a resource on an open network and don't use any of the standard, recognized methods to indicate don't index, don't share, (nor lock it away with auth).
It's like putting a sculpture in your front yard and getting upset when a neighborhood tour company points it out on their tour, even worse because yard ornaments don't have standard, accepted ways of saying "don't use".
> You put a resource on an open network and don't use any of the standard, recognized methods to indicate don't index, don't share, (nor lock it away with auth).
This is the kind of argument people used to use as they flagrantly violated your copyright by cloning your article on their own site. "You put it on the Internet, so it's free for everyone to copy."
The law says no such thing, at least not in any jurisdiction that I'm familiar with. Contrary to popular belief in some quarters, normal laws do still apply on the Internet.
If you infringe copyright, it's still infringement even if what you copied was freely available on someone else's site.
And if you state something that is misleading and harmful, it might still be defamation, even if what you stated was just an automatically generated snippet that takes a small part of someone else's site and shows it out of context.
Nah. Take it easy here, there is a long way between indexing and showing the most relevant hit and outright lifting big parts out of the site and using them on their own property:
It is more like if the guide that used to send visitors to your property has set up their own booth on the best spot on the sidewalk next to you and is raking in money because of the useless (often, in the last few years) ads they have plastered all over it.
Even if it is an educational non-profit resource you don't want that, as some of the details get lost when visitors only read the guide's summary instead of taking a closer look for themselves.
And according to people on this thread they will also complain and/or offer suggestions about how you can make it even more useful to them.
I think of it more as if you put a banner with content somewhere in the public, and I take a photo of it, what can I later do with that photo?
And for that, it's a question of copyright. It turns out that in the US, the fact that something is publicly available does not put it in the public domain. The original author still retains copyright unless explicitly stated otherwise.
There is an exception to this though, which is called fair use. And for that, I'd recommend reading this: https://amp.theatlantic.com/amp/article/411058/ Google's book search snippets were deemed fair use.
So the question remains: would website snippets similarly count as fair use? What would the federal courts rule? And when it comes to fair use, that's the only way to know whether it is or not.
It's worth pointing out in this context that the US legal concept of fair use is not universal. In fact, unusually for US IP laws, it's actually much more permissive than most other places. The more usual practice is to enumerate specific situations where copying without the copyright holder's consent is still allowed, instead of defining general tests, which is how fair use works. This has been a controversial point, because it's not clear that the US scheme is sufficient to meet its obligations under international treaties.
In answer to your final question, I'm not sure whether this use of snippets in search engine results has been tested in any US courts yet, but the issue of search engines showing enough content from the sites they link to that users never actually go through to the original site is sufficiently controversial that the EU's recently passed copyright directive includes specific provisions aimed at exactly that sort of situation.
> It is more like if the guide that used to send visitors to your property
Here is where your argument falls apart. The web is a public space - it's not your property or your front yard. It's more akin to going to the town square wearing a fancy hat and getting upset if people look at you and your weirdly shaped headwear.
> The web is a public space - it's not your property or your front yard.
You're wrong here. Just because it's a public space does not mean nobody owns the property. As a simple example, a shopping street is usually a public place. That does not mean that all window displays, doorways and adjacent buildings are automatically a free-for-all.
In fact, only "the tubes" of the web are a public space. The rest is owned property, even if there are no visible fences.
Laws everywhere are pretty much saying your take is wrong. There is no such thing as an implicit contract, and your take on it is plain victim blaming.
It is very surprising to read this on a board where many people write code: if a dev found unlicensed code, they would certainly not think it is public domain.
It's a devil's bargain. If you opt-out of snippets, it simply means somebody else claims the top spot, and you are left with even less traffic (by a significant amount)
> I disagree. There is an implicit contract between website publishers and search engines that it’s ok to do this. The website can set nosnippet in robots if they want to not have the snippet in search results.
Who made this contract? I never signed one. If I came to your place of business and copied your content and provided it somewhere else, I would be infringing your copyright. Do I have to put up signs specifying that at my place of business? Why is this any different? My web content is not the property of someone else and by publishing my information that is in no way an implicit grant of the right to reproduce it.
I wonder how true that assumption really is any more. The quality of traffic Google drives to sites I operate is very low compared to all other major sources, with much less engagement by any metric you like, notably including conversions. The only reliable exception is when we're running marketing campaigns in other places, which often result in spikes in both direct visitors landing on our homepage and search engine visitors arriving at our general landing pages.
There is this conventional wisdom that SEO, and in particular playing by Google's rules to rank highly in its results pages, is the only way you can run a viable commercial site these days. Our experience has been exactly the opposite: our SEO is actually quite effective, in that we do rank very highly for many relevant search terms, but it makes a relatively small contribution to anything that matters. And really, when I write "SEO" here, I'm only talking about general good practices like being fast, having a good information architecture and working well on different devices. We don't change the structure of our pages just because Google's latest blog post says X or Y is now considered a "best practice" or anything like that.
Of course I have no way to know how representative our experience is. YMMV.
Yes you can. There are other ways to market yourself and your website. For instance, the author of “Fearless Negotiation” has appeared in four or five podcasts I follow. The well known pundits in the Apple ecosystem grew an audience organically through word of mouth.
Hoping to stand out in Google results as a business plan is a recipe for failure. You are one algorithm change away from going out of business.
Glad the first-ranked response was this. It's what I came here to say. These days you simply don't need to click as often to get what you need out of a search, and Google's business model doesn't rely on click-throughs to websites, but on the display and click-through of ads.
Searching for "best car engine oil" has certain brands displayed straight on the featured snippet. Who cares about the click if Google found your customer for you and got your message through for free?
In the end, Google should care. If a search for "best car engine oil" got your product featured, that means you won a sale. But assuming the sale happens completely offline, Google lost its opportunity to inform you of the search, and of the successful search->sale conversion.
That means your marketing department can no longer justify investing money in Google SEO, which means less optimization towards Google's crawler, which means less reliable search results, which means less Google searches in the long run.
Increased profits from unknown sources VS decreased profits from known sources. The gained marketing intelligence may come at the cost of the bottom line.
I zero-click search more than I click, for reasons including but not limited to:
1) to get the correct spelling of a word that spell check can't find a suggestion for
2) to avoid going to a site where I might potentially get malware (for example, searching for music lyrics)
3) to avoid having to deal with slow-loading and bloated pages
"• Google earns more ad revenue as users stick around on Google longer"
This one is actually reversed. Google search doesn't net Google any money if people don't actually click a link, since ad revenue for Google search is per click, not per view (per mille).
The incentives for them are actually reversed - increasing the number of clicks to external websites, specifically advertised links, increases their revenue (which is why there are so many advertised links on a search page).
I do a fair amount of grammar and spelling searches. Google often displays tips and examples. And typing "sp500" displays a stock chart right in Google itself. Google has a lot of "instant snippets" like that. Quite convenient. However, near-monopolies do make me nervous about supporting them.
Great point, this was my first thought also. Google has been doing a slow creep of this type of content for the past few years through the featured snippet you mentioned, and other knowledge panel material. They now serve sports, weather, math, translation, flights, etc.
Speaking of scraping, does anyone know where one can get a hold of full text news articles/press releases for nlp research? Most APIs that I have found only offer partial texts.
I know that Aylien has an API for this but it's out of my price range.
If I recall correctly, content publishers (news especially) and some Europeans were very angry about that. I think the consensus was that these businesses don't understand the internet.
I don’t see any way to actually achieve this at scale, let alone any reason to add an opening for more pointless lawsuits. Let’s say they’re liable and you choose to act on incorrect information received for free. Do you really try to take them to court, and on what grounds?
Yes, Google and similar companies should be 100% responsible for anything published on their platforms. No more “safe harbour”. They have chosen to take positions in many issues, that makes them more like newspapers than phone companies.
Positions like what? And no, banning radicals from their platform for violating their terms of service is not a position.
Even if they were responsible, it's still legal to lie. You don't see pseudoscience websites being taken down because they are objectively false either.
It’s OK for the NYT to attempt to “prevent another Trump situation”. They have an editor and that person is legally responsible for what they publish. They don’t even pretend to be non-partisan. But Google takes a position then hides behind “common carrier” status. It’s not reasonable that they can pick and choose. Either they’re the phone company or they’re a publisher. It’s their right to be either of course, but they must choose.
> It’s not reasonable that they can pick and choose. Either they’re the phone company or they’re a publisher. It’s their right to be either of course, but they must choose.
This is 100% wrong, the opposite is true. The law explicitly protects website operators from being liable for content posted by 3rd parties while simultaneously granting them the explicit freedom to curate content that they deem objectionable.
No content on Google is posted there by 3rd parties. Google does select what is displayed and, in the case of snippets, they go beyond their traditional role to promote that content.
The use of the word "post" is my own colloquially imprecise language, the law actually states
> No provider or user of an interactive computer service shall be treated as the publisher or speaker of any information provided by another information content provider.
So content indexed by google absolutely falls under the definition of "provided by another information content provider"
Providing links is indeed within this definition. However, cards go beyond that: selecting one result out of many, promoting it, and possibly altering its meaning by choosing which parts are shown and how they are displayed goes far beyond merely displaying content provided by others.
Of course, there is plenty of room for google attorneys to wiggle, but in the end the objective for them is to 1) give credibility to a source and 2) to get the benefits of being the providers of information.
Common carrier and safe harbor are 100% separate and distinct concepts. Safe harbor is what lets a forum have a theme ("political party X posts only") and still curate at its discretion while bearing no responsibility for illegal posts - and I don't see how one could be against that - and Google is nowhere near that line, whatever "positions" you envision them to have taken.
Google and other tech companies never claimed to be common carriers, and even internet service providers have been cleared of that status - barely anyone is legally required to transmit without discretion (it's pretty much just phone companies). So why make it about Google and Twitter rather than starting with ISPs?
I don’t mean literally alphabetically but according to some objective measures. In the old days it was by incoming links (PageRank). But now it is opaque and many people are finding that it orders by whatever is best for Google, not for the user.
There is not nearly enough room on the front page for everyone who wants to be there. Google has to make subjective decisions about what shows up there; it's impossible to do it any other way.
I don't think that's a bad idea, but the vast majority of google users would not desire this behavior, especially the way google is used today where users try specific terms to relocate content they have looked up before.
Yet there is no spam in the Yellow Pages. It’s very unlikely that if you call Aaron he’ll clone your credit card or install hidden cameras in your house. Also, it’s very likely that he actually is a locksmith, has the accreditation he claims to, is a legitimate business registered at Companies House, fully insured, all the things you expect of a normal business.
Hold on. Google doesn't earn even a penny when you visit their site, find your answer on the search results page, and then leave. That user behavior COSTS Google money, it doesn't earn anything.
If they were trying to monetize you they'd show you an ad that links to your answer and take a profit on the click. Directly giving the user the answer they want is great for the user, but guarantees that Google won't earn any revenue.
So why does Google do it? Simple: because their competitors do. That's the free market for you. Google didn't start that feature, another competitor did; Microsoft made it their primary differentiating feature in fact (remember the "bing and decide" ads?). Google had to adopt the same behavior or lose their customers.
So no, don't blame Google, blame capitalism. This is precisely the kind of feature that you wouldn't get if Google was able to behave as a monopoly.
That is correct and should be considered copyright infringement... I am so tired of the double standard in the US of people vs. corporations... Corporations are considered better people than real people.
But as Google sucks up the consumer surplus, it's going to be harder and harder to make money from internet businesses, and the final result a few years down the road will be toxic.
The internet isn't going to work too well if it's solely reliant on hobbyists.
The funny thing is this used to happen. In the early days, if you asked a simple question you would get the answer in the search results, before they introduced featured snippets. The problem was, because no one was clicking on these useful sites, they were downgraded in the listings below sites that hid the useful info so you had to click through to it.
Perhaps you don't know who Examine.com is, but they are cited by the New York Times, Washington Post, The Guardian, CBC, multiple Wikipedia pages all over the world, and over 12,000 other websites. I assume they did more homework than you.
Some, like Mercola, peddle highly-controversial, near anti-vax content.
But on the other end of the spectrum, Examine.com should be the gold standard. Quality Raters should use it as an example of a site to emulate. Much higher-quality and more informative content than WebMD, IMO.
When I google for astaxanthin (the search suggested in the original post), Mercola ranks higher than Examine. Which is bloody tragic, as Examine gives legit information.
IMHO the mere fact a page links to relevant (to its subject) papers in reputable scientific journals should be considered a positive factor in a page rank computation.
Based on this link I wouldn't trust the critical thinking of the content of this site.
> We also know about Google firing James Damore and their ideological echo chamber.
They are already displaying their own bias, regardless of what they think Google's bias is.
Damore was fired because he violated California labor law. But the post continues with lines like:
> Now, I’m not a conspiracy theorist
...
> But it seems like they decided that there’s no way to algorithmically penalize certain sites — so instead they do it manually, behind the scenes, without telling anyone.
Without any evidence of this whatsoever.
It's one thing to say "Google are not doing a good job of filtering out misinformation/commercialization without penalizing high-quality information from smaller sites/institutions"; it's another to say that this is a conspiracy from the "echo chamber bias" of their employees to suppress the speech of people who don't agree with them.
Google health-related results are straight garbage these days. I noticed the change that dropped Examine in the rankings, as I frequently search for supplements. At the same time, Wikipedia also tanked in my queries, and it's one of the sites I'm almost always going to have a look at if they have a page on the topic. This seems tied into Google's expansion of their built-in snippets. That would be ok if they linked to the source, but it feels like that's true for only 1 in 20 of these boxes.
I’m actually really frustrated with these changes and would like to start using an alternative. I like Startpage, but they use Google search results so that’s not a viable alternative. Guess I’ll have to check out DuckDuckGo’s Bing-powered performance.
So... Google is openly editorialising their results?
They were already doing it with the carousel (google "american inventors") but if they are doing it with what seemingly is the list of organic results this is very, very troubling.
I mean, Google always was, even in the PageRank days. Even if you can perfectly recreate the numbers as to why so and so site is ranked higher in your system, it's still your system choosing to rank so and so site higher.
I see your point, but in this case they are blacklisting domains by hand because of their content, which they don't agree with. And that is very bad. Maybe it was my mistake, thinking that their organic search was holy, which is no longer the case, it seems.
I don't understand this point of view. Google's literal mission since inception was to rank results based on how good Google thought they were. Their purpose is to editorialize results through the order in which they appear. Quality is defined by Google's subjectivity.
Where did the idea of google neutrality come from? Google would be useless if they didn't blacklist what they perceive to be spam.
> Where did the idea of google neutrality come from?
From Google. They've stated time and time again that it's a magic algorithm and they don't hand-pick winners and losers. And it's a good thing, too, otherwise you're just inviting corruption. Top spots are literally worth millions, and if there's a small army of people deciding who ranks where, they are an obvious target for bribes.
This doesn't look that hand-picked, though; it looks more like somebody didn't check what would happen if they rolled out some algo change and targeted way too broadly.
they pick losers by identifying losers, or who SHOULD be a loser, and then modifying the algorithm to derank them and their tactics. believing they can target spam, without first identifying spam, doesn't make any sense.
in this case, some better sites resemble spam enough that they were also hit. a basic false positive, collateral damage.
That's different though. The algorithm applies to all sites, and if apple.com does the same thing a spam-site does, they will be punished by the algorithm as well. Hand picking is very different, in that similar things aren't treated similarly.
> in this case, some better sites resemble spam enough that they were also hit. a basic false positive, collateral damage.
I believe that as well, though not necessarily because of "spam", but because of the topic. I was just trying to explain how people might think that Google doesn't manually curate their results.
This is an extremely circular conversation. Google writes the algorithm that ranks pages.
They absolutely know that if you search Disney, and Disney isn't the first result, they wrote it incorrectly. They also know their product has less value if it returns spam, which is why they fight SEO artists.
They do try to distance themselves from "choosing" the top result for "best construction store" or "best news site" by shouting the word "algorithm" to distract the conversation. That doesn't mean they don't carefully craft the algorithm to return a relevant top result.
> That doesn't mean they don't carefully craft the algorithm to return a relevant top result.
I found https://medium.com/@mikewacker/googles-manual-interventions-... an interesting read on that topic. It's not just a crafted algorithm, but there are different algorithms and employees choose different algorithms for some queries if they/journalists dislike what the original algorithm considered most relevant.
I mean, that's how you'd train the main algorithm, right?
I'd fully expect to see these interventions fed back into the algorithm so Google can better predict "this search term is likely to be targeted by partisan or otherwise suspiciously motivated actors".
See, I don't think there's much difference between writing a deterministic mathematical algorithm to have X site on top, hand-curating a list to have X site on top or writing a magic spell that consults 4 neural nets, a space dragon from Jupiter and the Canadian Prime Minister for weightings that results in X site on top.
That's all implementation details, at the end of the day site X is on top and site Y is not, and Google decided that.
And as mentioned in the sibling thread, that's the value of Google Search. If you disagree that X should be on top, then find an alternative search engine that has some different ranking algorithm, but there's no such thing as an objective search engine.
My way of looking at it has always been there is no such thing as organic search results (other than that the organisation in question did not have a hand in choosing to put it there). The aim is, presumably, to return quality sites, which always involves a subjective judgement - there is no intrinsic or natural ordering of a set of sites (other people could choose to rank them differently). Whether it's writing an algorithm that results in those sites being at the top (which _must_ be rewritten if returns certain sites otherwise it will be gamed), a more direct choice, or a combination of the two, there doesn't seem to be much difference. The algorithm serves to scale Google's subjective opinion, and they will always boot out sites they don't think are of sufficient quality - it enables them to return sites that they may not have hand-judged, but I've always assumed it is trying to approximate what would be returned if they _did_ check every site.
> They were already doing it with the carousel (google "american inventors")
Maybe.
There's a chance you're seeing an unintentional side-effect of inclusion-oriented school projects getting hordes of people to Google stuff like "African American inventors".
That potentially shows the same phenomenon - kids given a Black History Month assignment to write a paper on a black inventor - and it's fairly clear there's some sort of automated threshold at work.
As evidence, I really doubt this difference is editorial-based:
Many similar websites have suffered from a so-called "Medic" update. Some of them recovered (through fixing their website), some did not. The SEO community is full of such stories, for more than a year I think.
FWIW, Dan (the author) has an outstanding reputation for professionalism and integrity in the marketing world. If he says he did something for ethical reasons, to those who know him, he's earned the benefit to be believed. (If you don't know him, you'd be forgiven for being suspicious)
And credit should be given to him for educating everyone on this exploit.
It'd be very easy to make a proof of concept of this exploit which didn't breach copyright or record people's personal information, and then to publicise the problem immediately; instead he chose to operate on real sites, collect real personal data, and then forget about it for 5 years. It's this general lax attitude that gives everyone working in the SEO sector -- and by extension the tech sector as a whole -- a bad reputation. The whole experiment doesn't feel like it was conducted in good faith or with any consideration for the ethics beyond 'hey, this is cool'. Grow up!
While it is clear that you did not have any bad intentions, you should never have published it on the web. Based on your earlier comment "It worked a little too well" it becomes clear that multiple users were tricked by your site and that you possibly even intercepted submitted forms ("I gasped when I realised I can actually capture all form submissions and send them to my own email.").
You misled people and breached their privacy. This is as simple as it gets, even if it was for an experiment (though leaving the site online in some other form still raises a lot of question marks..).
My advice for you is to perform future experiments locally, not on the web, and to make sure people participating in your experiment are aware.
The point of the experiment was the social engineering aspect. The fact that it would work technologically was obvious. The fact that it would work practically was what he set out to prove.