Ask HN: Has anyone successfully pivoted from web dev to AI/ML development?

simonw

Do you want to train models from scratch, or do you want to build cool things on top of AI models?

If the former, I suggest digging into things like the excellent Fast AI course: https://course.fast.ai/

If the latter, the (relatively new) keyword you are looking for is likely "AI Engineer" - https://www.latent.space/p/ai-engineer

There's an argument that deep knowledge of how to train models isn't actually that useful when working with generative AI (LLMs etc) - knowing how to train or fine-tune a new model is less useful than developing knowledge of the other weird things you have to figure out about prompting, evals and using these models to build production-quality apps.

phillypham

It's not too uncommon. I started off working with Angular and Java. But I studied math.

It depends on what type of role you want. If you'd be happy building the application layer and doing prompt engineering, just build applications that call LLM APIs.

If you want a research position at the top labs, the interviews really are passable by people without PhDs. They are really focused on having strong fundamentals. I've seen people make this leap but it can take years of preparation. Like actually reading textbooks, implementing low-level details like backprop, re-implementing papers, and doing non-trivial personal projects. Essentially, you're self-studying a Master's degree. Blog about it. Post about it here. I've found that people who make this transition just generally love learning.

sevensor

AI problems turn into data problems. The happiest and best compensated people I know in that area have gone into data engineering, because data engineers are the ones selling shovels in this gold rush.

gnarcoregrizz

Yes. An AI team was created at our company to bring the work in-house, and I was invited mainly to integrate it into the web app, and to do some ops work. A year later I'm fine-tuning models, building datasets, and working with PyTorch. Much different than webdev, not as rewarding sometimes, more unknowns, longer feedback cycles. The main issue is getting enough data, in both quality and quantity, which can be a grind. Happy to have taken this opportunity though. Endless things to learn.

hectormalot

Totally possible. About 30-40% of the people in our AI team don’t have a formal AI background. Especially with LLMs a lot of work has shifted towards “data literate software engineering”. We call them AI engineers / AI developers. Good development skills are very transferable to those roles.

Feel free to reach out if you’re in the EU (email in profile), we’re hiring. Also happy to give some pointers on how to approach these conversations.

ilaksh

AI, ML, and data science are all different things. And there are different types of jobs in each of those categories.

If you want to apply AI, there are lots of really useful projects that are just calling the Anthropic or OpenAI API for the AI part. Or replicate.com image models etc. That wasn't the case a few years ago before we had the general purpose models. I have been doing a lot of those types of projects and I don't have a machine learning background.

There are ML Ops jobs that don't require a lot of machine learning knowledge.

There are ML researcher jobs that are just training LLMs, which is more practical than theoretical.

To do novel machine learning research or at least significant variations of popular neural network architectures, I think that is the only thing that really requires years of study. But I think there is a very large gap between that type of work and web development. Which is why I was very happy to see the progress in general purpose models.

TbobbyZ

This post shows why programming as a career overall sucks. Sure it’s great if you really enjoy programming. However, staying relevant to earn a decent living your entire life is difficult.

Atheb

Not sure if my experience is relevant, but I did a couple of internships in web dev during my bachelor's degree in CS and quickly realized it wasn't for me. I then did a master's and am now doing a PhD in medical imaging where I extensively use machine learning (design and train my own models, doing both supervised and RL) but I wouldn't say I am a researcher in AI/ML.

Because I am still in the academic process, I had the opportunity to take a couple of classes on the subject. Three books that I would recommend going over to make sure your foundation in ML and mathematics is solid are:

- Pattern Recognition and Machine Learning by Christopher Bishop

- Mathematics for Machine Learning by Marc Peter Deisenroth, A. Aldo Faisal, and Cheng Soon Ong

- Deep Learning by Goodfellow, Bengio and Courville

All three are legally available online in some form. I can't say I have any experience in finding a job related to ML though.

spmurrayzzz

There is so much in the ML world that's being built in the open which you can use as a foundation for your learning. Some folks here already mentioned Jeremy Howard's fast.ai course which is absolutely a great place to start for anyone that a) has motivation to learn DL, and b) already knows how to code.

Beyond that, there are hundreds of open source projects you can fork to start building intuitions around what the inner loop of AI dev project cycles look like. You'll be surprised at how much of your web dev skills remain relevant in these projects, particularly in UI-related tasks. Data scientists and folks of similar ilk default to notebooks, gradio, streamlit et al. to ship interfaces for their experiments. You though have the ability to do that on your own if you choose to (sometimes a notebook is enough), which can be a valuable differentiator for you as a candidate if you also have all the other skills needed to be productive in this space.

My own background is in distributed systems with some full stack and embedded work mixed in over the years. I started tinkering with ML projects back in 2012 when I first discovered AlexNet and resources were far more limited. I was still able to get productive relatively quickly even though most of what I built wasn't really applicable to my work in a practical sense on day one. Where my background became relevant was when I needed something approximating an MLOps pipeline for data processing, training, and eval. Most of the code you're writing for that isn't really specific to ML; it's nearly identical to CI/CD systems but with the obvious infrastructure caveats native to ML workloads.

Nowadays though, especially if you're intrepid/resourceful, there is so much more learning material by comparison and much of what you work on can likely augment your day-to-day web dev tasks as well if you're creative enough.
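To illustrate the point above about pipeline code looking like CI/CD, here is a minimal sketch in Python. Everything in it is an assumption made for illustration: the stage names, the `Context` dict, the toy threshold "model", and the `min_accuracy` gate are hypothetical, not taken from any real MLOps tool.

```python
from typing import Any, Callable, Dict, List

Context = Dict[str, Any]
Stage = Callable[[Context], Context]


def process_data(ctx: Context) -> Context:
    # Drop rows with a missing feature value, like a lint/cleanup step in CI.
    ctx["clean"] = [(x, y) for x, y in ctx["raw"] if x is not None]
    return ctx


def train(ctx: Context) -> Context:
    # Toy "model": a threshold halfway between the two class means.
    pos = [x for x, y in ctx["clean"] if y == 1]
    neg = [x for x, y in ctx["clean"] if y == 0]
    ctx["threshold"] = (sum(pos) / len(pos) + sum(neg) / len(neg)) / 2
    return ctx


def evaluate(ctx: Context) -> Context:
    # Eval gate: fail the run if accuracy is below the configured minimum.
    # A real pipeline would score a held-out set, not the training data.
    correct = sum((x > ctx["threshold"]) == (y == 1) for x, y in ctx["clean"])
    ctx["accuracy"] = correct / len(ctx["clean"])
    if ctx["accuracy"] < ctx["min_accuracy"]:
        raise RuntimeError(f"eval gate failed: accuracy={ctx['accuracy']:.2f}")
    return ctx


def run_pipeline(stages: List[Stage], ctx: Context) -> Context:
    # Run stages in order; any exception stops the pipeline, as in CI.
    for stage in stages:
        ctx = stage(ctx)
    return ctx


result = run_pipeline(
    [process_data, train, evaluate],
    {"raw": [(1.0, 0), (2.0, 0), (None, 1), (8.0, 1), (9.0, 1)], "min_accuracy": 0.9},
)
```

The shape is the familiar one: ordered steps, shared state, and a failing check that blocks the "deploy". The ML-specific parts are what goes inside each stage and the hardware it needs.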

throwawayyyyhhh

Yes. Use your CS and math background to get a job in Operations Research. Use the OR techniques to do V&V on applications of AI/ML models.

I.e., verification that the application/model does the job correctly, and validation that the app/model is appropriate for the business case.

This is not exciting but it pays well and people with the skill set are extremely needed.

Vox_Leone

I made this transition. I am developing a project in the area of computer vision -- with intensive use of drones -- to serve Brazilian agribusiness. At the moment the challenge is to gather a dataset of images of the livestock breeds and crop varieties most produced and cultivated in the region, for detection, classification, monitoring, counting. The field of work is new and promising here. From a technical and professional point of view, it wasn't particularly difficult to make the transition, because I have an extensive background in data analysis and science. The difficult part is working on sales with a new type of customer.

From where I stand Computer Vision seems like a really good area to start in machine learning. Good luck!

gigawattz

For me, the shortest path I found from full-stack Web Dev to full-on AI Rockstar was this one YouTube video: https://www.youtube.com/watch?v=Yhtjd7yGGGA which clarified for me, in just 40 minutes, exactly how to 1) leverage my knowledge of Postgres into getting my hands dirty tokenizing a huge repository of any content into pure magical numerical data inside a vector db, 2) apply my years of experience building local site search UIs into the actual nuts & bolts of transforming a search query into an effective AI prompt, and 3) do all that on a cheap little EC2 server without a single GPU in sight! The video is over a year old now (!) and there are more recent updates and similar how-to's elsewhere, but this short, practical, and honestly amazing story of a guy crushing on the perfect first use case for AI on a website spoke directly to my web dev soul, and really demystified all the AI hype for me, so much more so than the firehose of AI-wrapper content ever could!
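To make the "search query into a prompt" step concrete, here is a minimal sketch in Python. Nothing in it comes from the video: `embed` is a hypothetical bag-of-words stand-in for a real embedding model, the in-memory `DOCS` list stands in for a Postgres vector table, and the prompt wording is made up.

```python
import math
import re
from collections import Counter
from typing import List


def embed(text: str) -> Counter:
    # Hypothetical stand-in for an embedding model: word-count vector.
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    # Cosine similarity between two sparse count vectors.
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def top_k(query: str, docs: List[str], k: int = 2) -> List[str]:
    # Rank documents by similarity to the query and keep the best k.
    q = embed(query)
    return sorted(docs, key=lambda d: cosine(q, embed(d)), reverse=True)[:k]


def build_prompt(query: str, docs: List[str], k: int = 2) -> str:
    # Paste the retrieved snippets into the prompt as context.
    context = "\n".join(f"- {d}" for d in top_k(query, docs, k))
    return f"Answer using only this context:\n{context}\n\nQuestion: {query}"


# Made-up content standing in for an indexed site.
DOCS = [
    "Refunds are processed within five business days.",
    "Our office is closed on public holidays.",
    "Shipping to the EU takes about one week.",
]

prompt = build_prompt("How long do refunds take?", DOCS, k=1)
```

In a real setup the vectors would come from an embedding model and the ranking would happen in the database, but the flow is the same: embed the query, fetch the nearest content, and hand it to the model along with the question.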

breckenedge

Here’s a thread also asking about this from last week: https://news.ycombinator.com/item?id=40797858

omneity

One workable pathway is to join a team working on AI products as a full-stack engineer. AI practitioners notably are less inclined to do frontend or even backend work, even more so when talking about SWE best practices and moving code to production. So there is plenty for you to offer given your 10 years' tenure in dev.

Then learn by symbiosis, while having AI on your resume :)

Feel free to get in touch if you want to chat more about this.

ivylee

I've mentored senior software developers transitioning to ML - everyone comes from slightly different background and experience. Feel free to schedule an intro call to discuss your specific situation https://cal.com/studioxolo/intro.

ent_superpos

Hi all. Wow! I'm blown away by the wealth of responses this post has gotten. I truly appreciate all of you sharing that knowledge. I especially appreciate the separation of disciplines that has been explained.

I think I would probably fit more into the AI Engineer category since I would have to do a lot more study for AI research, but I do enjoy trying to use existing models and libraries to accomplish tasks. I can also create toy models myself (I actually built a ConvNet in PyTorch to detect popup dialogs on my screen and alert me), but I'm nowhere near good enough to create entirely new approaches or architectures.

foweltschmerz

> I'd love to find a way to work on AI stuff professionally

Working on AI can mean many different things.

If you're looking to pivot into a more research-y position in DS and/or AI research, I'd suggest getting an advanced degree in these fields.

If you're talking more about ML engineering, there are full stack software engineer positions at some AI startups that require an assortment of skill sets such as web development, MLOps, and sometimes a bit of data engineering. You could look into these roles. Alternatively, since ML engineer is still an emerging position at a lot of organizations, some do not require prior experience but instead focus more on the candidate's portfolio. Create some projects and build a strong portfolio, and you might have a good chance.

jmartin2683

This is essentially what I’ve done, having started 15 years ago with Ruby on Rails and now leading development of most ML-related applications at my current job. I’ve been interested in ML since 2017 or so, just found it interesting.

j45

Successfully pivoting means learning and shipping things yourself on GitHub and talking about it on your own YouTube journal.

Specifically with AI/ML the urge might be to start from scratch but I think there might be enough tooling to start where you are with web, build solutions using existing AI tech, start customizing it and going deeper and deeper.

Makes for a natural story and journey too. It’s completely choose your own adventure so you can direct the vector of your path where you want and learn & build in that direction.

mistrial9

zooming out, the AI space is filled with insiders trading well-paid opportunities among others with very specific credentials.. in a high stakes and heavily constricted network of networks..

contrast this to an open market.. for example I make a poster for a grocery store in my town, that poster is well-liked and I have the rights to reproduce it, or make a similar new one. In every town there could be such activity, and for larger towns, many people could do that activity. That naturally scales for participation, with the transactions of pay and consumption in an open market environment.

The AI space seems much more like building large projects in tight teams with serious resource requirements. The end products are more varied than most people realize, but there is a common thread of replacing skilled humans in jobs with some kind of automation, or extracting value from humans with monitoring and some kind of enforcement. In other words, really contrasting to the open markets ideas.

Honestly I cannot be enthusiastic to put the words "job" and "AI dev" in the same sentence. The real-world dynamics appear to be coalescing into high powered, competing silos, with a side-effect of replacing jobs in some cases.

PaulHoule

I've gone back and forth between them, but it helps that I have a Physics PhD so I am not intimidated by the math.

I got my PhD in 1998, did a postdoc in Germany for a year, came back to the states, started doing remote work and consulting projects for web sites, worked on the arXiv preprint server for a few years, then worked on a pretty wide range of projects for pay and for side projects until I got interested in using automation to make large image collections on my own account circa 2008 or so.

I had a conversation with my supervisor that called into question whether I could ever be treated fairly where I was working and then two days later I got a call from a recruiter who was looking for a "relevance architect" which had me work for about a year and a half for a very disorganized startup. Then I got called by another recruiter who needed somebody to finish a neural network search engine for patents based on C++, Java and SIMD assembly.

After that I tried to put together a business to develop a next-generation data integration tool and did consulting projects, and learned Python because customers were asking for it. When I gave up on my own business I went to work full-time as a "machine learning engineer" for a startup that was building something similar to the product I had in mind. That company was using CNNs for text, I had previously worked for one using RNNs, that summer BERT came out and we realized it was important but not quite so important.

After that I wound up getting a more ordinary webdev job where I can actually go to an office, I still do ML and NLP-based side projects though.

Funny enough I am working on text analysis projects now that I first conceived of 20 years ago, I think technologically some of them could have worked but they work so much better now with newer models.

---

My take is that the average 'data scientist' is oriented towards making the July sales report, not making a script that will make the monthly sales report. If you want to get repeatable results with ML it really helps to apply the same kind of organizational thinking and discipline that we're used to in application development. Also I believe getting training data is the bottleneck for most projects: I mean, if you have 5000 labeled examples and a 20 year old classification model you might get a useful classifier, you can get a much better classifier with a two year old model with little more work, or you can try a model out of last week's arXiv paper and spend 10-100x the effort, risk complete failure, and probably add 0.03 points to your ROC AUC.

If you don't have those 5000 examples on the other hand all you can do is download some model from huggingface and hope it is close enough to your problem to be useful.
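For anyone unsure what a few hundredths on the ROC means in practice, here is a small sketch in Python, assuming the figure refers to the area under the ROC curve. It uses the standard pairwise interpretation of that area: the chance that a randomly chosen positive example gets a higher score than a randomly chosen negative one. The labels and scores are made up.

```python
from typing import List


def roc_auc(labels: List[int], scores: List[float]) -> float:
    # Area under the ROC curve via pairwise comparison:
    # fraction of (positive, negative) pairs ranked correctly, ties count half.
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative example")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Made-up example: one positive is scored below one negative.
auc = roc_auc([1, 1, 0, 0], [0.9, 0.4, 0.5, 0.1])
```

This is quadratic in the number of examples, which is fine for a few thousand labels. On that reading, a 0.03 gain means about 3 more correctly ordered pairs out of every 100.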

My spurt of doing front-end heavy work built up my UI skills so I have done a lot of side project work towards building systems that let people label data.

throwaway6734

I made the transition a few years ago by going back to school for my MS. This allowed me to re-enter the internship pipeline.

ruffrey

I pivoted from "full stack" to a software and infra engineer on an AI team. There was some luck involved, but I believe it helped that I'd taken some courses. I am not one of the data scientists, but work very closely with them. This seems like a good path to full time data science.

extr

As other commenters have noted, there are different kinds of ML/AI archetypes out there:

- "Real AI Scientist/Applied Scientist/Researcher" aka you do actual training/fine tuning of bleeding edge models. Very hot right now but competition is incredibly intense. Probably you need a PhD or some serious experience to compete. Get ready to do a bunch of independent learning if you're serious about this.

- "Fake AI Scientist/Applied Scientist/Researcher" - You work in a big corporation or maybe a confused startup who wants to staff out some internal AI teams but doesn't really have true expertise in the area. Maybe if you really knock it out of the park something you build will provide real customer value...one day.

- "Real AI-ML Engineer" Scientist work under a different name, or deployment/infra for custom models. More approachable than Real Scientist work but probably more focused on engineering chops, C++, CUDA, etc. Similar to Real Scientist in that you need to have some actual legit skills.

- "Fake AI-ML Engineer" Calling the OpenAI API and massaging the output into something that is possibly valuable but more likely is just an "AI Feature" on top of an existing application that provides real customer value.

- "Non-AI ML Engineer" You work with traditional ML like xgboost, probably in the financial world, and don't really interact with any of this stuff, unless your boss asks you to create a new AI Feature. You can now put this on your resume and hope to get a Fake AI-ML Engineer job, if you want to.

- "Real Data Science" this role is trending toward inhabiting more of a BI/Analytics space. IMO in Big Tech they are getting more serious about the stats/probability background here. In some ways I think this would be more difficult to upskill on than Deep Learning math, if you're starting without a math background.

- "Fake Data Science" Kind of a dying role, this is like "I just learned how to do pandas and scikit learn and I'm creating linear regressions for boomers". Honestly still some alpha here if you are a product-focused person in the right org. But maybe the title here should be more like Data Analyst++

Hope this helps. Me myself I'm a Non-AI ML Engineer who is pretty screwed, because if you search ML Engineer now everyone wants you to know PyTorch.

srinikhilr

Does anyone have advice for engineers working on distributed systems problems pivoting to MLOps/AI Infrastructure?

fardinahsan146

Conversely, I'm an ML guy who has to do webdev to make ends meet. Tips for me?

syngrog66

This should be in the Who Wants To Be Hired thread

shrimp_emoji

Woah, out of the cringepan and into the cringefire!

SkyMarshal

To begin, start hacking on open source AI projects, put them on your github, build up that track record. Get involved in various FOSS AI communities and collaborate with folks there [1]. The opportunities to transition to AI jobs will follow.

[1]:https://reddit.com/r/localllama