DSPy has 17k stars; meanwhile, PyReft (https://github.com/stanfordnlp/pyreft) isn't even at 1200 yet, and it has Christopher Manning (head of AI at Stanford) working on it (see their paper: https://arxiv.org/abs/2404.03592). Sometimes what the world deems "impactful" in the short-to-medium term is wrong. Think long term. PyReft is likely the beginning of an explosion in demand for ultra-parameter-efficient techniques, while DSPy will likely fade into obscurity over time.
I also know that the folks writing better samplers/optimizers for LLMs get almost no love/credit relative to the outsized impact they have on the field. A new sampler potentially improves EVERY LLM EVER! Folks like Clara Meister or the authors of the min_p preprint have had far larger impacts on the field than their citation counts might suggest: typicality or min_p sampling is now considered generally superior to top_p/top_k (though OpenAI, Anthropic, Gemini, et al. still use top_p/top_k), and min_p/typicality are implemented by every open source LLM inference engine (e.g. Hugging Face, vLLM, SGLang).
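For anyone who hasn't looked at min_p, here is a minimal sketch of the idea as I understand it from the preprint: keep only the tokens whose probability is at least `min_p` times the probability of the most likely token, renormalize, and sample from what is left. The function names (`softmax`, `min_p_filter`, `sample_min_p`) and the default `min_p=0.1` are my own illustrative choices, not any library's API, and real inference engines apply this to logit tensors alongside temperature and other filters.

```python
import math
import random


def softmax(logits):
    """Convert raw logits to probabilities (numerically stable)."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def min_p_filter(probs, min_p=0.1):
    """Drop tokens below min_p * max(probs), then renormalize.

    The cutoff scales with the top token's probability, so a confident
    distribution is pruned hard and a flat one keeps many candidates.
    """
    threshold = min_p * max(probs)
    kept = [p if p >= threshold else 0.0 for p in probs]
    total = sum(kept)  # always > 0: the top token survives for min_p <= 1
    return [p / total for p in kept]


def sample_min_p(logits, min_p=0.1, rng=random):
    """Sample one token index from the min_p-filtered distribution."""
    probs = min_p_filter(softmax(logits), min_p)
    return rng.choices(range(len(probs)), weights=probs, k=1)[0]


if __name__ == "__main__":
    # Hypothetical 4-token vocabulary, for illustration only.
    print(min_p_filter([0.5, 0.3, 0.15, 0.05], min_p=0.2))
    print(sample_min_p([2.0, 1.5, 0.2, -1.0], min_p=0.1))
```

Contrast this with top_k (fixed number of tokens) or top_p (fixed cumulative mass): here the cutoff adapts to how peaked the distribution is, which is the property people point to when they say it behaves better at high temperature.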
Picking something useful in 1-2 years is a reason to go to industry, not research, and leads mostly to incremental units of work that someone else will do if you don't. Yes, hot topics are good because they signal a time of fertile innovation. But not if your vision is so shallow that you will have half of IBM, Google, and YC competing with you before you start or by the time of your first publication (6-12 months). If you are a top student, well-educated already, with top resources and your own mentees, and your advisor is an industry leader who already knows where your work will go, maybe go to the thickest 1-2-years-out AI VC fest, but that's not most PhD students.
A 'practical' area would be obvious to everyone in 5 years, but to winnow out the crowd, there should not be much point to it today or in 1-2 years unless something fundamentally changes. It should be tangible enough to be relevant and enticing, but too expensive for whatever reason. More fundamental research would be even more years out. This gives you a year or two to dig into the problem, another year or two to build out fundamental solutions, and then a couple of years of cranking. From there, rinse and repeat via your own career or those of your future students.
Some of my favorite work took 1-2 years of research to establish the problem, not just the solutions. Two of the projects here were weird at first as problems on a longer time scale, but ended up as part of a $20M grant and software many folks here use & love, and another earned a 10-year test-of-time award. (And another became, arguably, a $100M+ division at Nvidia.) In contrast, most of my topic-of-the-year stuff didn't matter and was interchangeable with work by others.
Edit: The speech by Hamming on "You and your research" hits on similar themes and speaks more to my experiences here: https://fs.blog/great-talks/richard-hamming-your-research/
1. The paper represents such an impressive leap in performance over existing methods in AI that it is obviously impactful. Unfortunately, this way of generating impact is dominated by industry. No one can expect Academia to train O1, SAM, GPT5 etc. AI rewards scale; scale requires money, resources and manpower, and Academia has none. In the early days of AI, there were rare moments when this was possible: AlexNet, Adam, Transformers, PPO etc. Is it still possible? I do not know. I have not seen anything in the last 3 years and I’m not optimistic many such opportunities are left. Even validating your idea tends to require the scale of industry.
2. The paper affects the thought process of other AI researchers, and thus you are indirectly impactful if any of them cause big leaps in AI performance. Unfortunately, here is where Academia has shot itself in the foot by generating so many damn papers every year (>10,000). There are just so many that the effect of any one paper is meaningless. In fact, the only way to be impactful now is to be in a social circle of great researchers, so that you know your social circle will read your paper and later, if any of them make big performance improvements, you can believe that you played a small role in it. I have spoken to a lot of ML researchers, and they told me they choose papers to read based just on the people and research groups they know. Even being a NeurIPS spotlight paper means less than 10% of researchers will read your paper; maybe it will go to 50% if it’s a NeurIPS best paper, but even that I doubt. How many researchers remember last year’s NeurIPS best paper?
The only solution to problem 2 is radical. The ML community needs to come together and limit the number of papers it widely releases. Let us say it came out and said that only 20 curated papers will be widely published each year. Then you can bet most of the ML community will read all 20 of those papers and engage with them deeply, as they will be able to spend at least more than a day thinking about each paper. Of course you can still publish on arXiv, share with friends etc., but unless such a dramatic cutdown is made I don’t see how you can be an actually impactful AI researcher in Academia when option 1 is too expensive and option 2 is made impossible.
This is excellent advice, and in my experience does not represent the intuition that many young (and not so young) researchers begin with.
Papers come from projects and, if you care, good projects can yield many good papers!
There does seem to be a strong incentive to publish whatever and distribute the credit among dozens of people.
For the rare, actually impactful research, the advice is a bit trivial; you might as well quote Feynman:
1) Sit down.
2) Think hard.
3) Write down the solution.
> First, the problem must be timely. You can define this in many ways, but one strategy that works well in AI is to seek a problem space that will be 'hot' in 2-3 years but hasn't nearly become mainstream yet.
I think about research as, what can I bring that's unique? What can I work on that won't be popular or won't exist unless I do it?
If it's clearly going to become popular, then other people will do it. So why do I need to? I'm useless.
Yes. You'll increase your citation count with that plan. But if you're going to do what other people are doing, go to industry and make money. It seems crazy to me to give up half a million dollars a year to do something obvious and boring a few months before someone else would do it.
> "Today, the solitary inventor, tinkering in his shop, has been overshadowed by task forces of scientists in laboratories and testing fields. In the same fashion, the free university, historically the fountainhead of free ideas and scientific discovery, has experienced a revolution in the conduct of research. Partly because of the huge costs involved, a government contract becomes virtually a substitute for intellectual curiosity. For every old blackboard there are now hundreds of new electronic computers. The prospect of domination of the nation's scholars by Federal employment, project allocations, and the power of money is ever present — and is gravely to be regarded."
Unless you are independently (very) wealthy, you'll have to align your research with the goals of the funding entity, be that the corporate sector or a government entirely controlled by the corporate sector. You may find something useful and interesting to do within these constraints - but academic freedom is a myth under this system.
> "If you tell people that their systems could be 1.5x faster or 5% more effective, that's rarely going to beat inertia. In my view, you need to find problems where there's non-zero hope that you'll make things, say, 20x faster or 30% more effective, at least after years of work."
This works for up-and-coming fields, but once something is stable and works at large scale, it's all about the small improvements. Making petrol engines 1% more fuel-efficient would be massive. Increasing the conversion rate of online ads by 1% could make you very, very rich indeed. Good advice for AI probably; bad advice in other fields.
> "Invest in projects, not papers"
The best way to go about this, I think, is to allocate some fraction alpha of your time to projects, and (1-alpha) to things that produce short-term papers. Alpha should never be zero if you want a career, but it will start out small as you begin your PhD and gradually grow, if you can make it in academia. At some point you'll reach a compounding return where the projects themselves are spawning papers. One way to do this is to get to the point where you can hire your own PhD students, but there are several others.
As long as your 2-years-into-a-PhD review (some unis have them) is about how many papers you've published (somehow weighted by journal/conference rank) and how many others are in the pipeline, you need to focus on papers until the point when your institution will let you do something more useful. Think of it as a paper-writing bootcamp, so that once you do get more time for projects, you'll have practiced how to write up your results.
> "Make your release usable, useful ..."
This is excellent advice, also for anything else related to code.
I'm not a researcher, but have thought about doing a PhD in the past.
It's probably a lot more nuanced than this. Show progress but don't make it easily accessible. *Hide* something important for yourself. Kind of like modern day "open source".
So, here are some points and comments I offer that go in a slightly different direction (although, like I said, if you managed to get there, congrats!):
* You can write a good paper without it being a good project. One thing does not exclude the other, and the fact that there are many bad papers out there does not mean that papers themselves are bad. You can plan your work around a paper, do a good research job, and write a good scientific report without having an overarching research project that extends beyond it. Sure, it is great when that happens (and it will happen more as you become more experienced and senior), but it's not a requirement.
* Not thinking about the paper you'll write out of your work might deter you from operationalizing your research correctly. Not every project can be translated into a good research paper, with objective/concrete measurements that translate to a good scientific report. You might end up with a good GitHub repo (with lots of stars and forks), and if that's your goal, then great! But if your goal is to publish, you need to think early on: "what can I do that will be translated into a good scientific paper later?" This will guide your methods in the right direction and make sure you do not pull your hair out later (or at least not as much) when you get rejected a million times and end up putting your paper in a venue you're not proud of.
* Publishing papers generates motivation. When young researchers go too long without seeing the results of their work, they lose motivation. It's very common for students to have this philosophical stance that they want to work on the next big project that will change the world, and that they need time and comfort and peace to do that, so "please don't bother me with this 'paper' talk". Fast forward three years and they have nothing published, are depressed, and spend their time playing video games and procrastinating. The fact is that people see other people moving forward, and if they themselves don't, no amount of willpower to "save the world" with a big project will keep them going. Publishing papers gives motivation; you feel that your work was worth it, you go to conferences and talk to people, you hear feedback from the community. It's extremely important, and there's no world where a PhD student without papers is healthier and happier than one with papers.
* Finishing a paper and starting the next one is a healthy work discipline. Some people just want to write a good paper and move on. Not everyone feels so passionate about their work that they want to spend their personal time on it and push it past every boundary. You don't have to turn your work into your entire life. Doing a good job and then moving on is a very healthy practice.
I hope they are first asking, to which bank accounts is the research actually making a difference? It's a great fraud to present research problems to stimulate the intellect of the naive youth when they have no capability to assess its social impact.
I can't spend a year on topics that seem interesting but that might not yield papers if they don't work. From the bureaucratic point of view, which is almost all that matters for junior researchers, that would simply be time in the bin.
I would love to spend years on something I care about without caring how many papers it will generate, but if I do that, I won't have a career.