Empirical vs Generative Data
The last decade was about evaluating empirical data. The next decade is about evaluating generative data. Synthetic data itself isn't the goal; understanding it is.
This is a TIL of a TIL. Simon Willison wrote about using an NVIDIA DGX Spark with an Apple Mac Studio for faster inference, with all the details here. He talks about the two phases of inference. Prefill: influences Time-To-First-Token (TTFT). Decode: influences Tokens Per Second (TPS). Prefill: read the prompt; build a Key-Value Cache for each transformer layer in the model; bound by compute, as it initializes the model's internal state and does a lot of matrix multiplication. ...
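To make the two metrics concrete, here is a minimal Python sketch. It assumes, as a simplification, that TTFT is roughly the prefill time and that TPS is the number of generated tokens divided by the decode time. The function name `inference_metrics` and all timings are hypothetical and are not taken from the post being summarized.

```python
def inference_metrics(prefill_seconds, decode_seconds, generated_tokens):
    """Return (ttft, tps) under a simplified two-phase model of inference.

    Assumption: TTFT is approximated by prefill time alone, and TPS counts
    only tokens produced during the decode phase.
    """
    if decode_seconds <= 0 or generated_tokens <= 0:
        raise ValueError("decode_seconds and generated_tokens must be positive")
    ttft = prefill_seconds                   # prefill drives time-to-first-token
    tps = generated_tokens / decode_seconds  # decode drives tokens per second
    return ttft, tps


# Hypothetical timings, for illustration only.
ttft, tps = inference_metrics(prefill_seconds=0.5, decode_seconds=4.0, generated_tokens=200)
print(f"TTFT: {ttft:.2f}s, TPS: {tps:.1f}")
```

Speeding up prefill lowers the first number, and speeding up decode raises the second, which is why the two phases can be tuned separately.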
The Demo-to-Product Gap? The March of Nines? In AI, the last 0.001% takes 99.999% of the time. We need a name for it, and given how much the Andrej & Dwarkesh interview has been cited, it's only fair to call it "The Karpathy Principle". Here is a 30-second YouTube clip for reference.
A Read Through On Venting I have a close friend to whom I vent every once in a while. He recently shared a post he wrote called On Venting. I immediately understood why he responds to all of my non-venting messages and pretends as if the vents never happened: When folks interact with me, I often skip past the emotional complaints. I like to think I do the following, but it's probably only true half the time, at best: ...
Founders create problems. Operators solve them. I've met a lot of executives and CTOs who resonate with two things: they like to solve puzzles, and they like to be given an end goal and just make it happen. The job of a founder is to identify opportunities (i.e. problems), set a vision for what a solution looks like, and build a team that's enabled to get there. That idea, of course, comes with plenty of qualifiers and caveats. It simply defines a founder's end goal, not their day-to-day. ...
I can never see myself writing code by hand again. It feels like doing multiplication and division with large numbers before there was a calculator. While the agents are working, I'm reviewing, editing, and staying in context. Basically, I'm only typing to think, not to do.
Onboarding engineers is hard. You need to set up your development environment, build context, navigate the codebase, navigate the organization, understand the system, the product, and the list goes on. I used to believe that the best way to onboard an engineer on day one is to make sure they: get their development environment set up, make a pull request to the codebase, and merge and deploy to production. You get an immediate hit of dopamine and experience the end-to-end flow of shipping a feature or fixing a bug. ...
Task: Product Founder Fit: Building domain expertise is not hard, but entrenching oneself in the culture of an existing industry takes time. Patience: Patience with results but impatience with action. Complementing oneself: Kind of like relationships, I learnt a lot about myself. Marketing. Advertising. Selling. Domain expertise. Etc. My skillset is about ideation, strategic direction, execution, planning, organization, inspiration. I really have to get better at selling and building relationships. Managing emotional intelligence: For myself and for others. This is a tl;dr and I really wish I had spent more time writing in the past.
There's a quote I heard a little while ago that really stuck with me: If you are not a liberal when you are young, you have no heart. If you are not a conservative when you are old, you have no brain. Following up my recent post about politics, I have a complementary version to this: If you are a young engineer who doesn't hate politics, you have no heart. If you are an old engineer who doesn't hate politics, you have no brain. ...
Feedback shouldn't be a one-time or regularly scheduled event. It's a constant daily process that should happen just-in-time. It should be critical and unemotional. The only rule is that you should praise more than you criticize. In short, remember to praise as often as possible, when deserved, and criticize as soon as possible, when necessary.
One thing I've changed my mind on recently is the importance of politics. When I worked as an individual contributor at large companies, I hated the bureaucracy and politics. Don't get me wrong, I still do. But I've come to realize that politics is not a blocker; it's a lever you can pull. If you can learn to navigate it, it is a lot more powerful than any engineering work. ...
Fools ignore complexity. Pragmatists suffer it. Some can avoid it. Geniuses remove it. - Alan Perlis
Chris Dixon once said: "Come for the tool, stay for the network." The modern version of that is: "Come for the tech, stay for the community." The power of a strong community never ceases to amaze me. Chris Dixon's post from 2015
Stripe is building a new blockchain: Tempo. Meta tried the same ~5 years ago with Libra/Diem. An overlooked point when comparing the two is what they're focusing on... First, a primer on 2 core blockchain concepts. Consensus: how you agree on a set of transactions. Execution: how you process those transactions. Meta → Consensus: designed & built DiemBFT (HotStuff-based); optimized for scalable, permissionless consensus; execution layer → MoveVM; spurred Aptos Labs, Sui Network, 0L Network (use DiemBFT and MoveVM). Stripe → Execution: ...
Computer Science degrees are more relevant than ever before. In the 2010s, new grads said: "Nothing I learned applies to work." You studied CS, but did software engineering. Today, AI does most of the software engineering for you. To know what AI can and cannot do requires understanding the fundamentals. A CS degree isn't a must, but it certainly helps.
Feynman's Problem Solving Playbook: (1) Write down the problem. (2) Think very hard. (3) Write down the answer. AI-era Feynman's Problem Solving Playbook: (1) Type out a good prompt. (2) Think very hard about the output. (3) Repeat (1) & (2) until you get the answer.
One of the biggest mistakes in AI evals is treating them as objective truth. Benchmarks and leaderboards are a great signal, but they are not universally applicable. Think SATs or job interviews: directionally correct, but not a guarantee of on-the-job performance. Results depend on context, environment, operations, and collaboration. And just like with people, the more you actually work with a particular LLM, the more results start to compound.
I hear a lot of conflation between RPCs and APIs. Let's fix that. RPC is the channel two computers use to talk. Think of it as the phone line. API is the contract for how they talk. Think of it as the language.
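As a rough illustration of that distinction, the Python sketch below keeps the contract and the channel in separate classes. Every name here (`GreeterAPI`, `ToyRpcChannel`, `GreeterClient`, `GreeterServer`) is hypothetical, and a JSON string stands in for bytes on a real network; it is a toy under those assumptions, not a real RPC framework.

```python
import json
from typing import Protocol


class GreeterAPI(Protocol):
    """The API: the contract (the "language") both sides agree on."""

    def greet(self, name: str) -> str: ...


class GreeterServer:
    """Implements the contract on the callee's side."""

    def greet(self, name: str) -> str:
        return f"Hello, {name}!"


class ToyRpcChannel:
    """The RPC: the channel (the "phone line") that carries a call across."""

    def __init__(self, server: GreeterServer) -> None:
        self._server = server

    def call(self, method: str, **params: str) -> str:
        wire = json.dumps({"method": method, "params": params})  # caller encodes
        request = json.loads(wire)                               # callee decodes
        handler = getattr(self._server, request["method"])
        return handler(**request["params"])


class GreeterClient:
    """Client stub: speaks the same API, but sends each call over the channel."""

    def __init__(self, channel: ToyRpcChannel) -> None:
        self._channel = channel

    def greet(self, name: str) -> str:
        return self._channel.call("greet", name=name)


local: GreeterAPI = GreeterServer()
remote: GreeterAPI = GreeterClient(ToyRpcChannel(GreeterServer()))
print(local.greet("Ada"), remote.greet("Ada"))
```

Both objects satisfy the same contract; only the remote one routes the call through a channel. Swapping the channel for a different transport would not change the API.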
KISS: Keep It Simple, Stupid. But remember: simple isn't easy.
Work in an environment that teaches you what your limits are but doesn't push you past them. Work-life integration isn't about balance or 996. It's about learning your limits, growing them over time, and flirting with the boundary.
Amazing read showing what a single engineer can do when they actually care about their work. Loved the post because it also showcased what the real world looks like: a balance of planning and hacking. Some decisions were data-driven, while others were more intuitive. It dives deep into some issues while letting go of others, never understanding their root cause. And lastly, it is another signal that Postgres is likely going to solve all your problems unless you have really big data.
Some feedback I got from an anonymous friend: You don't necessarily need 100K labels, depending on the domain. Data labeling for complex tasks starts to approximate knowledge work, which comes with interesting new angles for "collaborative labeling". Combining human labels with automated methods is important.
This post convinced me to go down the Claude Code Learning Curve. Thanks for all the insights! Posting my #ActiveReading notes below. "Experience is an excellent school; unfortunately, the fees are very high." Let me share the fruits of my experience so you don't have to pay the high fees. Loved this quote. If Claude does something you don't like, don't just correct it once; ask it to update the CLAUDE.md file so it remembers for next time. ...
Below is a very terse and opinionated set of steps to data labeling and model evaluation.
tl;dr Teams should build their own LLM tooling, starting with GitHub PR Description generators.
Definitely worth a read (pun intended). Good perspective on how much momentum a certain geography has when the flywheel kicks off. It's near impossible to reverse, so we need to get it right from the start. Let's not fuck up the software industry. Critique: the article is great but, for me, a bit too dense with facts, dates, names and timelines.
@mtorygreen I'm currently the CTO at Grove and Head of Protocol at Pocket Network. We're launching a full rewrite of Pocket early next year (a migration fully rewritten using the latest Cosmos SDK). There are some easy lifts we can do to support io.net relaying via Pocket then. Would love to chat when you're open to it.
Thank you for going through the pain (and documenting it) so we don't have to!
The Road to a Personal Computer In the mid-90s, my family got their first computer. I vividly remember the Windows 95 logo loading up on the screen as I booted a computer up for the first time. Later that day, two things happened: I visited lego.com for about 5-10 minutes. I remember asking why there's a limit to how long I could spend on the web. I loaded up a floppy disk with Doom II. This experience scarred me from playing first-person shooters for almost a decade. ...
This is a post in a series of articles I'm writing called "5 points & 1 resource" (think tl;dr but 5p;1r), where I summarize a list of 5 concepts that would have helped me start learning or re-learning a certain topic. It is intentionally far from a complete source of data. I recently came across a reference on the nuances between an individual, their insurance plan and a drug company in an article titled "Is My Drug Copay Coupon a Form of Charity - Or a Bribe?". I thought it was important enough to be summarized in 5 points below. ...
tl;dr Configure your router to disable IPv6 and use a 2.4 GHz WiFi band if you're running into the same issue.
Great article! Just wanted to provide my own tl;dr below ▪ MPT Ingredients: Merkle Tree + Prefix Tree + Custom Modification ▪ MPT Nodes: leaf nodes + branch nodes + extension nodes ▪ State root: 256-bit hash of root ▪ Geth KV Store: LevelDB ▪ Parity KV Store: RocksDB ▪ Tries: 1. Transaction Trie - mapping from transaction hash → raw transactions 2. Transaction Receipt Trie - mapping from transaction hash → transaction execution metadata ...
I put together a small sequence diagram using mermaid to help visualize this.
Could you split this into two PRs, please? Could you please approve this so I can merge it in? Why are you implementing XXX using A rather than B? NIT: extra space. In the context of code reviews, I've found myself on both the giving and receiving end of these types of comments more than once. I must say that the following tweet still rings true today: Having worked on production systems at large companies, internal systems at mid-size companies and most recently joining a small and agile team, I increasingly realize that the purpose of code reviews depends on the stage and size of both the team and project. For example, for a mature, production-grade, critical system at a large established company: ...
Reading this statement made me think about why fiction, in the form of films, TV, books, games, plays and more, has such a big place in our...
When I read the title, I was thinking the article would discuss how everyone uses and interacts with it through centralized services. This is true for both a user (etherscan.io, opensea.io) and a developer (infura.io). I agree with Nate's comment that Ethereum contracts are indeed public and immutable. Whether an individual chooses to sync a node, inspect the code and understand which functions are public and which can be run only by the contract owner to modify state is up to them. ...
For lack of a better word, 2020 has been very interesting for financial markets. Volatility hit all-time highs, some fortunes were wiped...
I agree that your reply almost warrants an article of its own :) To be honest, the fact that Embed.ly works with multi-file gists when you specify the file name really makes me wonder what Medium is doing on the backend (and why). The proxy provider solution seems viable but a bit of an overkill IMO if it's possible to avoid. It'd be great if Medium Engineering could look at this and, ideally, even avoid the preprocessing step that's making things complicated. Fingers crossed that they respond.
One of my favorite features of Elixir is being able to start a shell that loads the entire context of my project: $ iex -S mix It provides easy access to all of the project's modules, so you can easily iterate on your code by compiling it directly from within the shell. You have the option to recompile the whole project or just a single module: # single module $ r MyModulesNameSpace.MyModule # whole project $ recompile The other great thing about Elixir is how easy it is to run unit tests: ...
This is a great article and a great exploration. Your pros and cons list sums everything up, but I'm curious whether you'd ever leverage these Postgres features for a system you use at work that has to be maintainable in the long term?
"Map data is only available to an app that is running on a device physically inside the location where the map was built. We check both coarse (WiFi or GPS) location and using precise visual feature and sensor data to match whether you are in the physical location." If I stand outside the door of someone who is running the app inside, wouldn't I be able to get a sparse map cloud of their house?
Fees are a metric worth looking into but I would say that the story they tell is more of how sustainable the network is as opposed to the adoption rate by the general public. They're actually growing at a much higher rate than I had expected! I think this is healthy given how BTC will probably be a settlement network and lightning networks will be used for everything else. I agree with you regarding spam attacks and other one-off events. ...
I think it's worth noting that in order to use await, the function directly enclosing the test code must be marked as async. Helpful resource: https://stackoverflow.com/questions/42299594/await-is-a-reserved-word-error-inside-async-function.
I'm having some trouble with the truffle console. When I type "Vo" and press Tab, it autocompletes to Voting as expected. When I type "Voting.d" and press Tab, it redeploys all the contracts rather than autocompleting to "Voting.deployed" as expected. Mahesh Murthy, could you confirm whether this is the expected behaviour or is it potentially some sort of weird configuration on my end?
I like the price ceiling and floor approach but see two issues with it. First, if we're selling a DAO token with equal voting rights, and a majority is required to make some decision, then someone could come in and purchase the equivalent of the token's market cap + ε. They would then be able to make an authoritative decision. If they choose to sell the tokens after the decision had been made, they'd have effectively paid (market cap + ε - num_tokens_purchase * price_floor) to make that decision. Second, from the point of view of an investor, being paid by the beneficiaries is great for a company like Apple that pays dividends. However, if the company/beneficiary can make better use of their income by reinvesting it rather than paying it out to the token/share holders, then price appreciation of the token/share is how that value gets realized. Amazon is a great example of this. In the safe token model, the role of price appreciation would be determined by the sale administrator rather than market forces, which seems kind of odd...
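To see what the cost expression in the first issue works out to, here is a small Python sketch with made-up numbers. The function name `decision_cost`, its parameter names, and every value are hypothetical; they only illustrate the arithmetic and describe no real token sale.

```python
def decision_cost(market_cap, epsilon, num_tokens_purchased, price_floor):
    """Net cost of buying control, deciding, then selling back at the floor.

    Mirrors the expression in the comment:
    (market cap + epsilon) - num_tokens_purchased * price_floor
    """
    amount_paid = market_cap + epsilon                      # spent to gain control
    amount_recovered = num_tokens_purchased * price_floor   # resale at the floor
    return amount_paid - amount_recovered


# Hypothetical numbers, for illustration only.
cost = decision_cost(market_cap=1000, epsilon=10, num_tokens_purchased=100, price_floor=8)
print(cost)
```

The higher the price floor, the more of the purchase is recovered on exit, so the cheaper it becomes to buy a one-off decision.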
While a correction is currently taking place, and it's impossible to foresee what will happen over the coming months, my opinion is that the realm of possibilities with blockchain technology is still in its infancy. I don't think we'll see real stability until it's adopted by mainstream organizations, which could very well take at least ten years. By then, the market cap of all cryptocurrencies, in my opinion, will far exceed a trillion dollars.
Balaji S. Srinivasan One of the projects you omitted that I think is very noteworthy is Aragon. It's built on top of Ethereum and helps you manage a company (token issuance, vesting schedules, etc.) without all the intermediaries.
Reading this felt like watching a Pixar movie: the humble protagonist with a dream, embarking on a long journey, training really hard (insert montage here), with both highs and lows, making new friends and connections along the way, ultimately culminating in a happy but slightly different than expected ending. You should legitimately consider selling the rights to your story. You said your goal was to tell a story through dance, and in a way you managed to do it by writing this piece. Your passion for the art and the candidness of your words took me on a wilder emotional rollercoaster than I had originally expected. ...
Personally, I think we're entering an era where acquisitions will be more common than IPOs. The bubble is not bursting, but it's not growing at the same rate as it used to, and there's not much more air to fill it. VC money is drying up, and seeing how most startups still can't turn a profit but have valuable IP, an acquisition is a good out for shareholders and investors. ...
Great piece! My takeaway was that as long as all your needs are met, money just acts as insurance. Nothing changes when you get home / auto / health insurance unless an unlikely situation arises, in which case it really could prove useful.
I couldn't agree more. Having grown up in a reform Jewish household, I always enjoyed the holidays and just having a reason to get together as a community and celebrate. Arguably, you don't need a reason to celebrate, but tradition kind of served as a calendar that everybody worked their schedules around. However, I think that religion itself is an outdated concept in a modern society. Like you said, religion serves to provide a strict set of rules to live by. This was very important and necessary in the early days of human civilization when no prior rules existed, but that is no longer the case. We now have a central government, laws, order (more or less), and those who still abide by rules from centuries ago hinder themselves from reaching their full potential.
A walk from the subway station, inspired by The Hunger Games Prior to my senior year in college, I made my way through the Hunger Games trilogy. I read all three books back-to-back in a relatively short period of time. Having spent so much time in Suzanne Collins's world and writing, I felt as if her writing style had rubbed off on me at the time. On the first day of my senior year in college, I felt an overwhelming urge to write something as I exited the subway. It took approximately 7 minutes to walk from the subway to the building where most of my classes were held, during which I wrote the piece below. The content is utter nonsense, but I think it really captures the writing style and essence of The Hunger Games in some way. ...