Hacker News | genidoi's comments

Atomic clock non-expert here, what does having a fleet of atomic clocks entail and why would the hyperscalers bother?

Having clocks synchronized between your servers is extremely useful. For example, having a guarantee that the timestamp of arrival of a packet (measured by the clock on the destination) is ALWAYS later than the timestamp recorded by the sender is a huge win, especially for things like database scaling.

For this, though, you need to go beyond NTP to PTP, which is still usually based on GPS time and atomic clocks.
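
A rough way to see why that guarantee is achievable: if the worst-case clock offset between any two machines is smaller than half the minimum one-way network latency, the receive timestamp can never come out earlier than the send timestamp. A minimal sketch of that argument (the offset and latency numbers are made up for illustration):

    # Sketch: when is recv_ts >= send_ts guaranteed despite imperfect clocks?
    # Each host reads "true" time through its own offset clock. If every
    # offset is within +/-max_offset and one-way latency never drops below
    # min_latency, then 2 * max_offset < min_latency rules out inversion.

    def observed(true_time, clock_offset):
        """Timestamp a host records for an event happening at true_time."""
        return true_time + clock_offset

    def ordering_holds(send_offset, recv_offset, latency):
        true_send = 0.0
        true_recv = true_send + latency
        return observed(true_recv, recv_offset) >= observed(true_send, send_offset)

    # Assumed illustrative bounds: clocks within +/-100us of true time,
    # one-way latency never below 250us.
    max_offset = 100e-6
    min_latency = 250e-6

    # Worst case for inversion: sender's clock runs fast, receiver's runs
    # slow, and the packet takes the minimum possible time.
    assert ordering_holds(+max_offset, -max_offset, min_latency)
    print("inversion impossible:", 2 * max_offset < min_latency)

PTP with hardware timestamping typically gets those offsets down to the sub-microsecond or tens-of-nanoseconds range, which is why it is used here rather than plain NTP.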


Actually interesting to think about what UTC actually means; there seems to be no absolute source of truth [0]. I guess the worry is not so much about the NTP servers (for which people should configure failovers anyway) but the clocks themselves.

[0] https://www.septentrio.com/en/learn-more/insights/how-gps-br...


Could you define an absolute source of truth based on extrinsic features? Something like taking an intrinsic time from atomic sources, pegged to an astronomical or celestial event; then a predicted astronomical event that would allow us to reconcile time in the future.

It might be difficult to generate enough resolution in measurable events that we can predict accurately enough? Like, I'm guessing the start of a transit or alignment event? Maybe something like predicting the time at which a laser pulse will be returnable from a lunar reflector -- if we can do the prediction accurately enough then we can re-establish time back to the current fixed scale.

I think I'm addressing an event that won't ever happen (all precise and accurate time sources are lost/perturbed), and if it does it won't be important to re-sync in this way. But you know...


Spanner depends on having a time source with bounded error to maintain consistency. Google accomplishes this by having GPS and atomic clocks in several datacenters.

https://static.googleusercontent.com/media/research.google.c...



And more importantly, the tighter the time bound, the higher the performance, so more accurate clocks easily pay for themselves in infrastructure costs saved elsewhere to serve the same number of users.
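
To make that concrete: in a TrueTime-style design, a transaction's commit has to wait out roughly twice the clock uncertainty bound ε before its timestamp is guaranteed to be in the past on every machine, so tightening ε directly shortens the wait. A back-of-the-envelope sketch (the ε values are illustrative, not anyone's real numbers):

    # Sketch: in a TrueTime-style scheme the commit wait is roughly
    # 2 * epsilon, where epsilon is the clock uncertainty bound, so a
    # tighter bound means shorter waits and more serialized commits per
    # second on contended data.

    def commit_wait_seconds(epsilon_s):
        # Wait until the chosen commit timestamp is definitely in the past
        # on every clock in the system.
        return 2 * epsilon_s

    for label, eps in [("loose bound (illustrative)", 1e-3),
                       ("tighter bound (illustrative)", 100e-6),
                       ("very tight bound (illustrative)", 10e-6)]:
        wait = commit_wait_seconds(eps)
        print(f"{label}: epsilon={eps * 1e6:.0f}us, "
              f"wait={wait * 1e3:.2f}ms, "
              f"~{1.0 / wait:.0f} serialized commits/s on one hot row")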

There's a lot of focus in this thread on the atomic clocks but in most datacenters, they're not actually that important and I'm dubious that the hyperscalers actually maintain a "fleet" of them, in the sense that there are hundreds or thousands of these clocks in their datacenters.

The ultimate goal is usually to have a bunch of computers all around the world run synchronised to one clock, within some very small error bound. This enables fancy things like [0].

Usually, this is achieved by having some master clock(s) for each datacenter, which distribute time to other servers using something like NTP or PTP. These clocks, like any other clock, need two things to be useful: an oscillator, to provide ticks, and something by which to set the clock.

In standard off-the-shelf hardware, like the Intel E810 network card, you'll have an OCXO, like [1], paired with a GPS module. The OCXO provides the ticks; the GPS module provides a timestamp to set the clock with and a pulse marking when to set it.

As long as you have GPS reception, even this hardware is extremely accurate. The GPS module provides a new timestamp, potentially accurate to within single-digit nanoseconds ([2] datasheet), every second. These timestamps can be used to adjust the oscillator and/or how its ticks are interpreted, such that you maintain accuracy between the timestamps from GPS.

The problem comes when you lose GPS. Once this happens, you become dependent on the accuracy of the oscillator. An OCXO like [1] can hold to within 1µs accuracy over 4 hours without any corrections, but if you need better than that (either more time below 1µs, or better than 1µs accuracy over the same period), you need a better oscillator.

The best oscillators are atomic oscillators. [3], for example, can maintain better than 200ns accuracy over 24h.

So for a datacenter application, I think the main reason for an atomic clock is simply retaining extreme accuracy in the event of an outage. For quite reasonable accuracy, a more affordable OCXO works perfectly well.
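
As a back-of-the-envelope check on those holdover figures, accumulated time error during an outage is roughly the fractional frequency offset times the outage length (ignoring drift and aging, so this is the optimistic case):

    # Sketch: time error accumulated during GNSS holdover, assuming a
    # constant fractional frequency offset (real oscillators also drift
    # and age, so treat this as the optimistic case).

    def required_stability(max_time_error_s, holdover_s):
        """Fractional frequency stability needed to stay within the error."""
        return max_time_error_s / holdover_s

    # OCXO figure quoted above: 1 microsecond over 4 hours.
    print("OCXO: ~%.1e" % required_stability(1e-6, 4 * 3600))      # ~6.9e-11

    # Atomic standard figure quoted above: 200 ns over 24 hours.
    print("atomic: ~%.1e" % required_stability(200e-9, 24 * 3600)) # ~2.3e-12

That roughly lines up with the difference in stability class between a good OCXO and an atomic frequency standard.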

[0]: https://docs.cloud.google.com/spanner/docs/true-time-externa...

[1]: https://www.microchip.com/en-us/product/OX-221

[2]: https://www.u-blox.com/en/product/zed-f9t-module

[3]: https://www.microchip.com/en-us/products/clock-and-timing/co...


I don't know about all hyperscalers, but I have knowledge of one of them that has a large enough fleet of atomic frequency standards to warrant dedicated engineering. Several dozen frequency standards at least, possibly low hundreds. Definitely not one per machine, but also not just one per datacenter.

As you say, the goal is to keep the system clocks on the server fleet tightly aligned, to enable things like TrueTime. But also to have sufficient redundancy and long enough holdover in the absence of GNSS (usually due to hardware or firmware failure on the GNSS receivers) that the likelihood of violating the SLA on global time uncertainty is vanishingly small.

The "global" part is what pushes towards having higher end frequency standards, they want to be able to freewheel for O(days) while maintaining low global uncertainty. Drifting a little from external timescales in that scenario is fine, as long as all their machines drift together as an ensemble.

The deployment I know of was originally rubidium frequency standards disciplined by GNSS, but later that got upgraded to cesium standards to increase accuracy and holdover performance. Likely using an "industrial grade" cesium standard that's fairly readily available, very good but not in the same league as the stuff NIST operates.


GPS satellites have their own atomic clocks. They're synchronized to clocks at the GPS control center at Schriever Space Force Base, Colorado, formerly Falcon AFB. They in turn synchronize to UTC as maintained by the US Naval Observatory. GPS has a lot of ground infrastructure checking on the satellites, and backup control centers. GPS should continue to work fine, even if there's some absolute error vs. UTC. Unless there have been layoffs.

> There's a lot of focus in this thread on the atomic clocks but in most datacenters, they're not actually that important and I'm dubious that the hyperscalers actually maintain a "fleet" of them, in the sense that there are hundreds or thousands of these clocks in their datacenters.

I mean, fleets come in all sizes; but if you put one atomic reference in each AZ of each datacenter, there's a fleet. Maybe the references aren't great at distributing time, so you add a few NTP distributors per datacenter too and your fleet is a little bigger. Google's got 42 regions in GCP, so they've got a case for hundreds of machines for time (plus they've invested in Spanner, which has some pretty strict needs); other clouds are likely similar.


Considering that you chose not to include your name or even an HN username in the byline of the article, there is an argument to be made that you are, in fact, hiding from it.


Also, three uses of a semicolon for no reason. Nobody writes like this.

> The log is the truth; the order book is just a real-time projection of this sequence.

> The book is fast; the log is truth.

> Matching engines can crash; the log cannot.


I need to sharpen my BS sensor. At first glance, I struggled to parse the voice in the article. Going back to it, I can now see the obvious gaps. Generally, AI tends to say things that make us go "what the hell is this?" For example, "The Problem: Ordering Chaos" is a very weird way to phrase it. As a human I struggled to accept it, and I did so by stretching the meaning of that phrase to a world model where it made sense. That is, our tendency is to give wide leeway to what we read or see and be very "accepting" in that sense. Instead, I think a better default is to reject everything we see or read.


I didn’t catch it either on the first pass but also felt something was off about the article, as if a human had sanitised most of the AI idiosyncrasies out.

Now I have made a note to auto-distrust any “article” that lacks the name of an author willing to personally own any accusations of it being AI slop.


This is an interesting observation. One possible explanation for the lack of robust first-class table manipulation support in mainstream languages is the large variance in real-world table sizes and the mutually exclusive subproblems that come with each respective jump in order-of-magnitude row count.

The problems one might encounter with a 1m-row table are quite different from those of a 1b-row table, and a 1b-row table is a rounding error compared to the problems a 1t-row table presents. A standard library needs to support these massive variations at least somewhat gracefully, and that's not a trivial API surface to design.
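
As a toy illustration of why one API rarely stretches across those scales, here's the obvious in-memory approach to a group-and-sum, which is perfectly fine at a million rows and hopeless at a billion (illustrative Python, not a proposal for any particular standard library):

    # Sketch: a group-by-and-sum that is perfectly reasonable for ~1m rows
    # in memory, but whose assumptions (everything fits in RAM, one machine,
    # one pass) all break somewhere between 1b and 1t rows.

    from collections import defaultdict

    def group_sum(rows, key, value):
        """rows: iterable of dicts. Returns {key_value: sum of value}."""
        totals = defaultdict(float)
        for row in rows:            # fine for millions of rows...
            totals[row[key]] += row[value]
        return dict(totals)         # ...but the result must fit in memory too

    rows = [{"country": "NZ", "sales": 10.0},
            {"country": "AU", "sales": 5.0},
            {"country": "NZ", "sales": 2.5}]
    print(group_sum(rows, "country", "sales"))  # {'NZ': 12.5, 'AU': 5.0}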


TUI libraries have abstracted away the low-level quirks of terminal rendering to the point that the terminal has become something like a canvas[0] available in the IDE with no extensions. This is quite a nice DevX if you want to display the state of an app that does something to data, without writing the plumbing needed to pipe that data to a browser and render it.

[0] https://github.com/NimbleMarkets/ntcharts/blob/main/examples...
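
For a sense of what "terminal as canvas" means one level down from libraries like the one linked above, here's a minimal sketch using Python's standard curses module (Unix terminals); TUI libraries layer widgets, layout and event handling on top of exactly these primitives:

    # Sketch: drawing app state onto the terminal "canvas" with Python's
    # standard curses module. TUI libraries add widgets, layout and event
    # handling on top of exactly these primitives.

    import curses
    import time

    def draw(stdscr):
        curses.curs_set(0)        # hide the cursor
        stdscr.nodelay(True)      # make getch() non-blocking
        values = [3, 7, 2, 9, 5]  # pretend this is live app state
        while True:
            stdscr.erase()
            stdscr.addstr(0, 0, "live data (press q to quit)")
            for row, v in enumerate(values, start=2):
                stdscr.addstr(row, 0, "#" * v)   # crude horizontal bar chart
            stdscr.refresh()
            values = values[1:] + values[:1]     # rotate to fake updates
            if stdscr.getch() == ord("q"):
                break
            time.sleep(0.2)

    if __name__ == "__main__":
        curses.wrapper(draw)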


They did this in the 1970s and 1980s too; back then they were called “forms libraries” but were often full application frameworks, in ways that would be familiar to modern developers of native graphical apps.


Turbo Pascal springs to mind because I know someone who made a video store management system with all kinds of forms and screens (via Turbo Vision[0]) in the early 90s.

[0] https://en.wikipedia.org/wiki/Turbo_Vision


The low-level terminal stuff is still grody as hell. Years ago, HN had some blog posts from someone who was rethinking the whole stack, but I dunno what happened to that project. If people really like TUIs, eventually they're going to stop doing the 1980s throwback stuff.


It's still around, still doing its thing. One developer drafted a ratatui backend for it, but he's been silent lately. I'm only marginally interested in that angle, as its endgame "just" lands at "Turbo Vision but Rust!", and having to stay compatible with the feature set of terminal emulation defeats the point.




IMO, the fact that "AWS Certified Solutions Architect" is yet another AWS service/thing, attainable via an actual exam[0] for $300, is indicative of just how intentionally bloated the entire system has become.

[0] https://aws.amazon.com/certification/certified-solutions-arc...


> where I feel so disconnected from my codebase I'd rather just delete it than continue.

If you allow your codebase to grow unfamiliar, even unrecognisable to you, that's on you, not the AI. Chasing some illusion of control via LLM output reproducibility won't fix the systemic problem of you integrating code that you do not understand.


Who cares about blame? It would just be useful if the tools were better at this task in many particular ways.


It's not blame, it's useful feedback. For a large application you have to understand what different parts are doing and how everything is put together, otherwise no amount of tools will save you.


The process of writing the code, thinking all the while, is how most humans learn a codebase. Integrating alien code sequentially disrupts this process, even if you understand individual components. The solution is to methodically work through the codebase, reading, writing, and internalizing its structure, and comparing that to the known requirements. And yet, if this is always required of you as a professional, what value did the LLM add beyond speeding up your typing while delaying the required thinking?


I completely agree.


> is absolutely cruel.

> What a horrible thing

They offered 6 months' severance, which dispels any serious notion of 'cruelty'. Substance over form.


Sure I stabbed you, but I gave you a million dollars!


The problem with using random real-world situations as analogies for niches within Software Engineering is that they're not only (almost) always wrong, but always misrepresentative of the situation in its entirety.


Our entire profession is “how can we make things difficult enough not to be used incorrectly”

That applies from user experience: “how do I get the user to click the button”, to security: “how do I lock things down enough to prevent most attacks I can think of”, to hardware design: “how do I ensure the chipset won’t melt down under really stupid software conditions”

Starting with the low-hanging fruit isn't always the worst option. Sometimes it's enough to get people to give up.


> But honestly what are the examples of people losing their jobs to software?

And furthermore, what is the full causality chain that links the precise PR in the provided example software to the employment termination decision? Lacking that, can you really assert the software 'automated' a dev out of the job?


Honestly, I would even accept a statistical test (but a conclusive one, no p-hacked bullshit). Yet I just don't see examples of devs being automated out of a job.

