A. Below is a list of OpenAI's initial hires from Google. It's implausible to me that there wasn't quite significant transfer of Google IP.
B. Google published extensively, including the famous 'Attention Is All You Need' paper, but OpenAI, despite its name, has not explained the breakthroughs that enabled o1. It has also switched from a charity to a for-profit company.
C. Now this company, with a group of smart, unknown machine learning engineers, presumably paid fractions of what OpenAI researchers are paid, has created a model far more cheaply, and has openly published the weights and many methodological insights, which will be used by OpenAI.
1. Ilya Sutskever – One of OpenAI’s co-founders and its former Chief Scientist. He previously worked at Google Brain, where he contributed to the development of deep learning models, including TensorFlow.
2. Jakub Pachocki – Formerly OpenAI’s Director of Research, he played a major role in the development of GPT-4. He had a background in AI research that overlapped with Google’s fields of interest.
3. John Schulman – Co-founder of OpenAI, he worked on reinforcement learning and helped develop Proximal Policy Optimization (PPO), a method used in training AI models. While not a direct Google hire, his work aligned with DeepMind’s research areas.
4. Jeffrey Wu – One of the key researchers involved in fine-tuning OpenAI’s models. He worked on reinforcement learning techniques similar to those developed at DeepMind.
5. Girish Sastry – Previously involved in OpenAI’s safety and alignment work, he had research experience that overlapped with Google’s AI safety initiatives.
> A. Below is a list of OpenAI's initial hires from Google. It's implausible to me that there wasn't quite significant transfer of Google IP.
I agree there's hypocrisy but in terms of making a strong argument, you can safely remove your list of persons who (drum roll)... mostly _didn't_ actually work at Google?
I think this project is awesome and am quite disappointed with some cynical commentary from large American labs.
Researchers at Meta or OpenAI spend hundreds of millions on compute and are paid millions themselves, while not publishing any of their learnings openly; here, a bunch of very smart, young Chinese researchers have had some great ideas, proved they work, and published details that allow everyone else to replicate them.
"No “inscrutable wizards” here—just fresh graduates from top universities, PhD candidates (even fourth- or fifth-year interns), and young talents with a few years of experience."
"If someone has an idea, they can tap into our training clusters anytime without approval. Additionally, since we don’t have rigid hierarchical structures or departmental barriers, people can collaborate freely as long as there’s mutual interest."
Lendable | Data Science & Software Engineering | Hybrid & Remote | UK, London | Full Time
Lendable have built the big three consumer finance products from scratch: loans, credit cards and car finance. Help us improve the best-in-class credit analytics that powers credit-decisioning.
Mixture density networks are quite interesting if you want probabilistic estimates from a neural network. Here, your model learns to output an array of Gaussian distribution parameters (means and standard deviations) and mixture weights.
These parameters are specific to individual observations, and are trained to maximise likelihood.
This approach characterizes a different type of uncertainty than BNNs do, and the approaches can be combined. The BNN tracks uncertainty about parameters in the NN, and mixture density nets track the noise distribution _conditional on knowing the parameters_.
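As a minimal numpy sketch of what "trained to maximise likelihood" means here: the mixture negative log-likelihood an MDN head minimises, evaluated on hypothetical (not trained) per-observation mixture parameters.

```python
import numpy as np

def mdn_nll(y, weights, means, stds):
    """Mean negative log-likelihood of y under per-observation
    1-D Gaussian mixtures.

    weights, means, stds: shape (n_obs, n_components) arrays,
    one mixture per observation, as an MDN head would output.
    """
    y = np.asarray(y, dtype=float)[:, None]
    # log N(y | mu_k, sigma_k) for every component
    log_pdf = (-0.5 * ((y - means) / stds) ** 2
               - np.log(stds) - 0.5 * np.log(2 * np.pi))
    # log sum_k w_k N(y | mu_k, sigma_k), via log-sum-exp for stability
    log_mix = np.log(weights) + log_pdf
    m = log_mix.max(axis=1, keepdims=True)
    ll = m.squeeze(1) + np.log(np.exp(log_mix - m).sum(axis=1))
    return -ll.mean()

# Hypothetical two-component mixtures for two observations
w = np.array([[0.5, 0.5], [0.8, 0.2]])
mu = np.array([[0.0, 3.0], [1.0, -1.0]])
sd = np.array([[1.0, 1.0], [0.5, 2.0]])
print(mdn_nll(np.array([0.1, 1.0]), w, mu, sd))
```

In a real MDN the network emits `means`, `stds` (via a softplus/exp activation) and `weights` (via a softmax), and this quantity is the training loss.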
Adding one to both the numerator and the denominator when calculating average ratings isn’t a terrible idea.
In situations where you’re estimating probabilities (like the average rating of an item based on user reviews), there is a Bayesian interpretation of this adjustment.
The Beta distribution is a conjugate prior to the binomial distribution, meaning the posterior distribution is also a Beta distribution.
By adding one pseudo-positive and one pseudo-negative review (one to the number of positive reviews, two to the total number of reviews), you're effectively using a Beta(1,1) prior (a uniform distribution); the posterior mean is (k + 1) / (n + 2), Laplace's rule of succession.
This approach smooths the estimated average, especially for items with a small number of reviews. This is a useful regularisation, pulling extreme values towards the mean and reflecting the uncertainty inherent in limited data.
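The whole adjustment is one line; a small sketch with illustrative numbers:

```python
def smoothed_rating(positives, total, prior_pos=1, prior_neg=1):
    """Posterior mean under a Beta(prior_pos, prior_neg) prior.

    With the default Beta(1,1) (uniform) prior this is Laplace's
    rule of succession: (k + 1) / (n + 2).
    """
    return (positives + prior_pos) / (total + prior_pos + prior_neg)

# An item with a single positive review is pulled toward 50%,
# while one with 90/100 positives barely moves.
print(smoothed_rating(1, 1))     # 2/3 ≈ 0.667
print(smoothed_rating(90, 100))  # 91/102 ≈ 0.892
```

Larger `prior_pos`/`prior_neg` values encode a stronger prior and pull small-sample items harder toward the prior mean.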
If you withhold a small amount of data, or even retrain on a sample of your training data, then isotonic regression is good for solving many calibration problems.
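In practice you would reach for sklearn's `IsotonicRegression`; as a toy illustration (not the library implementation), the core pool-adjacent-violators idea fits in a few lines of plain Python:

```python
def pav_fit(y):
    """Pool-adjacent-violators: least-squares fit of a
    non-decreasing sequence to y, where y is (say) observed
    accuracy per bucket, ordered by model score."""
    # Each block holds (count, mean); merge blocks while the
    # monotonicity constraint is violated.
    blocks = []
    for v in y:
        blocks.append((1, float(v)))
        while len(blocks) > 1 and blocks[-2][1] >= blocks[-1][1]:
            n2, m2 = blocks.pop()
            n1, m1 = blocks.pop()
            n = n1 + n2
            blocks.append((n, (n1 * m1 + n2 * m2) / n))
    fitted = []
    for n, m in blocks:
        fitted.extend([m] * n)
    return fitted

# Per-bucket accuracies with one out-of-order dip:
print(pav_fit([0.1, 0.4, 0.3, 0.8]))  # [0.1, 0.35, 0.35, 0.8]
```

The violating pair (0.4, 0.3) gets pooled to its average, giving calibrated probabilities that still respect the model's score ordering.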
I also agree with your intuition that if your output is censored at 0, with a large mass there, it's good to create two models, one for likelihood of zero karma, and another expected karma, conditional on it being non-zero.
I hadn't heard of isotonic regression before but I like it!
> it's good to create two models, one for likelihood of zero karma, and another expected karma, conditional on it being non-zero.
Another way to do this is to keep a single model but have it predict two outputs: (1) likelihood of zero karma, and (2) expected karma if non-zero. This would require writing a custom loss function which sounds intimidating but actually isn't too bad.
If I were actually putting a model like this into production at HN I'd likely try modeling the problem in that way.
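As a minimal numpy sketch of such a two-output loss, assuming a Bernoulli head for P(karma = 0) and a Gaussian head for karma when non-zero (the function name and the Gaussian choice are illustrative, not a prescription):

```python
import numpy as np

def hurdle_nll(y, p_zero, mu, sigma=1.0):
    """Joint negative log-likelihood for a two-output model.

    p_zero: predicted P(y == 0); mu: predicted mean of y given y > 0.
    """
    y = np.asarray(y, dtype=float)
    p_zero = np.clip(p_zero, 1e-9, 1 - 1e-9)
    is_zero = (y == 0)
    # Bernoulli part: did the comment get zero karma?
    nll = -np.where(is_zero, np.log(p_zero), np.log(1 - p_zero))
    # Gaussian part, applied only to the non-zero observations
    gauss = 0.5 * ((y - mu) / sigma) ** 2 + np.log(sigma * np.sqrt(2 * np.pi))
    return (nll + np.where(is_zero, 0.0, gauss)).mean()

y = np.array([0.0, 0.0, 12.0, 3.0])
print(hurdle_nll(y,
                 p_zero=np.array([0.9, 0.8, 0.1, 0.2]),
                 mu=np.array([0.0, 0.0, 11.0, 4.0])))
```

Training a network with two heads against this loss gives you both predictions at once; the expected karma is then `(1 - p_zero) * mu`.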
Did you dictate this? It looks like you typo'd/brain-o'd "centered" into "censored", but even allowing for phonetic mistakes (of which I make many) and predictive text flubs, I still can't understand how this happened.
I was thinking of censoring, maybe I should have said another word like floored.
The reason I think of this as censoring is that there are some classical statistical models for distributions with a large mass at a minimum threshold, e.g. "tobit" censored regression.
Thanks for the explanation. I never paid much attention in my stats lectures so I deserve to have missed out on that term-of-art. I think the physics lingo would be to call it "capped" or "bounded" or "constrained".
ValueError at /accounts/signup
The given username must be set
Request Method: POST
Request URL: http://www.rashomonnews.com/accounts/signup
Django Version: 2.2.10
Exception Type: ValueError
Exception Value:
The given username must be set
Exception Location: /home/deployer/newsbetenv/lib/python3.5/site-packages/django/contrib/auth/models.py in _create_user, line 140
Python Executable: /home/deployer/newsbetenv/bin/python
Python Version: 3.5.2
Python Path:
['/home/deployer/rashomon',
'/home/deployer/rashomon',
'/home/deployer/newsbetenv/bin',
'/usr/lib/python35.zip',
'/usr/lib/python3.5',
'/usr/lib/python3.5/plat-x86_64-linux-gnu',
'/usr/lib/python3.5/lib-dynload',
'/home/deployer/newsbetenv/lib/python3.5/site-packages']
Server time: Mon, 28 Oct 2024 16:36:54 +0000
One thing that can be useful with sampling is sampling a consistent but growing sub-population. This can help maintain a consistent holdout for machine learning models, help you sample analytical data, and let you test joins without null issues, etc.
If you use a deterministic hash, like farm_fingerprint, on your id column (e.g. user_id) and keep rows where the hash modulo N = 0, you will keep the same growing list of users across runs and across queries to different tables.
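The same idea in Python, using `hashlib` as a stand-in for BigQuery's `FARM_FINGERPRINT` (any stable hash works; Python's built-in `hash` does not, since it's salted per process):

```python
import hashlib

def in_holdout(user_id, n=10):
    """Deterministically keep ~1/n of ids. The kept set is stable
    across runs and tables, and only grows as new ids appear."""
    digest = hashlib.md5(str(user_id).encode("utf-8")).hexdigest()
    return int(digest, 16) % n == 0

sample = [uid for uid in range(1000) if in_holdout(uid, n=10)]
print(len(sample))  # roughly 100 of the 1000 ids
```

Because membership depends only on the id, joining the sampled users table to any other table keyed on `user_id` never produces orphaned rows.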
My understanding of why bagging works well is that it's a variance reduction technique.
For a given algorithm, the bias will not increase if you train n versions in an ensemble, but the variance will decrease: anomalous observations won't persistently appear in the submodels' bootstrap samples, so their influence won't persist through the bagged average.
You can test this: the gap between train and test AUC will not increase dramatically as you increase the number of trees in an sklearn random forest for the same data and hyperparameters.
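You can also see the variance-reduction effect in a toy numpy simulation, using the bootstrap-sample mean as a stand-in for a single high-variance model (idealised: real bagged submodels are correlated through the shared data, so the reduction is smaller than 1/n, but still real):

```python
import numpy as np

rng = np.random.default_rng(0)
data = rng.normal(loc=5.0, scale=2.0, size=200)

def single_model():
    # "Train" one model on a bootstrap resample of the data
    boot = rng.choice(data, size=data.size, replace=True)
    return boot.mean()

def bagged_model(n_models=25):
    # Average the predictions of n bootstrap-trained models
    return np.mean([single_model() for _ in range(n_models)])

singles = [single_model() for _ in range(500)]
bagged = [bagged_model() for _ in range(500)]
print(np.var(singles), np.var(bagged))  # bagged variance is much smaller
```

Both estimators centre on the same value (no added bias), but the spread of the bagged predictions is far tighter, which is the whole point of the technique.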