
I've had 4 "Failed - Network error" messages in a row.


I have twice created an archive with my Hangouts data, and the archive contained only an errors.html file explaining that Hangouts.json failed with a "Service failed to retrieve this item" message. I left "feedback" saying so. I guess we'll see where this goes.


All right, finally got a nonempty archive. Looks like it's probably complete. Seems fairly legit. However, there is one thing that confuses me so far. Looks like a lot of links get a link_target attribute that is different from the actual link:

  {
    "type" : "LINK",
    "text" : "http://en.wikipedia.org/wiki/American_Letter_Mail_Company",
    "link_data" : {
      "link_target" : "http://www.google.com/url?q=http%3A%2F%2Fen.wikipedia.org%2Fwiki%2FAmerican_Letter_Mail_Company&sa=D&sntz=1&usg=[alphanumeric token scrubbed]",
      "display_url" : "http://en.wikipedia.org/wiki/American_Letter_Mail_Company"
    }
  }
However, links to YouTube, or Google, or any subdomain of either, have a link_target attribute identical to their display_url. Interesting. I wonder what that alphanumeric token is. It's like how Google search results also have, rather than normal links, links to google.com/url?[a ton of URL parameters]. I assume the latter is so they can gather data about which URLs are clicked in the search results, or possibly pasted elsewhere, and conceivably to discourage scraping. And as for this?

Perhaps it's a kludge for Google chat clients, which need to parse URLs (they do something special with YouTube links) and might thus be freed to do it stupidly. Perhaps Google wants to know what people do with their downloaded Hangouts archives. Perhaps Google wants to know what people do with Hangouts history in the browser, they've changed the links there, and they just leave them that way in the exported format. (Turns out Hangouts history in the browser has exactly those links, so I'm guessing it's the last one.) Well, at any rate, at least it's easy to ignore that field.
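For what it's worth, if you'd rather unwrap that field than ignore it, the real URL seems to just be the q parameter of the google.com/url link. A minimal sketch in Python, using the field names from the JSON above; everything else, including the abbreviated example record (token elided), is my own assumption:

  from urllib.parse import urlparse, parse_qs

  def unwrap_link_target(link_data: dict) -> str:
      """Return the underlying URL if link_target is a google.com/url redirect."""
      target = link_data.get("link_target", "")
      parsed = urlparse(target)
      if parsed.netloc.endswith("google.com") and parsed.path == "/url":
          # The real destination is carried in the 'q' query parameter.
          q = parse_qs(parsed.query).get("q")
          if q:
              return q[0]
      return target

  # Abbreviated example record (usg token elided):
  link_data = {
      "link_target": "http://www.google.com/url?q=http%3A%2F%2Fen.wikipedia.org"
                     "%2Fwiki%2FAmerican_Letter_Mail_Company&sa=D&sntz=1",
      "display_url": "http://en.wikipedia.org/wiki/American_Letter_Mail_Company",
  }
  print(unwrap_link_target(link_data))
  # -> http://en.wikipedia.org/wiki/American_Letter_Mail_Company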


Does it fail when you create the Takeout or when you try to download the archive?


I created it twice successfully last night. I thought maybe the servers were overloaded, so I waited until this morning to download the second one. Checked my Gmail, got the "your archive download is ready" email, clicked the link, logged into Google's site, clicked Download Archive... mine is several hundred megs in size, it gets between 60 and 150 megs downloaded, and then, again this morning, it quits with the error I described.

Maybe there's some bug relating to the size of my download but I'd think a few hundred megs shouldn't be an issue to download.
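One thing I haven't actually tried, so treat it purely as a sketch: since the download keeps dying partway through, resuming from the last byte with an HTTP Range request might pick up the rest, assuming Google's download servers honor Range at all. The URL and filename below are made up, and in practice you'd also need the cookies from your logged-in session:

  import os
  import requests  # third-party: pip install requests

  url = "https://takeout.example.com/archive.zip"  # hypothetical
  path = "archive.zip"

  # Resume from however many bytes we already have on disk.
  done = os.path.getsize(path) if os.path.exists(path) else 0
  headers = {"Range": f"bytes={done}-"} if done else {}

  with requests.get(url, headers=headers, stream=True, timeout=60) as r:
      r.raise_for_status()
      mode = "ab" if r.status_code == 206 else "wb"  # 206 = Partial Content
      with open(path, mode) as f:
          for chunk in r.iter_content(chunk_size=1 << 20):
              f.write(chunk)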


Perhaps you should choose a single category (like Gmail) of your Google content, download that, then go back and download another single category until you have it all.


It would probably work, but it's only a few hundred megs I'm talking about. I logged onto my Linode and created a 588 meg file with this:

  # 3k blocks of 196 KiB each: 196 * 3 MiB = 588 MiB of zeros, synced to disk
  dd if=/dev/zero of=testing bs=196k count=3k oflag=dsync
Then I downloaded it and it went fine. So the problem is on Google's side.

I tried downloading the first archive so many times that their service finally told me I'd downloaded it too many times; then the second archive failed within roughly the same time frame. Their service is counting the failed downloads as successful, which is bad in itself.


I've been similarly unable to download my archive :(



