As others noted, the datasets are not really standardized, even with the SEC EDGAR data, so there is a lot of massaging you have to do.
A system that does that for you would not really be a trading bot per se, it would just be a general algorithm for "picking stocks". Automating the actual purchasing is probably unnecessary.
If you find one, let us know! Most investors in the world are searching for such a thing.
Actual long term investors today are looking at an additional two billion people by 2050, increased demand for food and water, and regional instability due to climate change.
Long term investors today are buying land and resource access around the globe, or moving to secure such things via private contractor/mercenary armies.
China has purchased one in four US pigs (the farms, the feed, the processing), the Saudis have locked in access to large quantities of US aquifers, and Erik Prince wants the US to retake Africa: https://theintercept.com/2024/02/10/erik-prince-off-leash-im...
These are all examples of securing access to water and food resources to ensure supply into the long term.
The investment payoff is having those resources when others don't: being secure in what you need and being able to profit from what you don't need in times of extreme demand.
You don't need trading bots for long term investing or even infrequent trading. In LT investing, portfolio tracking and asset allocation/reallocation are the primary tasks. Robo-advisors were very popular almost a decade ago. Most brokerages have integrated such features now. Also, check out M1 Finance.
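To make the "allocation/reallocation" part concrete, here is a minimal sketch of the arithmetic a rebalancing tool does. The function name, asset labels, and dollar amounts are all made up for illustration; it ignores taxes, fees, and minimum lot sizes.

```python
def rebalance_trades(holdings_value, target_weights):
    """Return the dollar amount to buy (+) or sell (-) per asset to hit the target weights.

    holdings_value: {asset: current market value}
    target_weights: {asset: desired fraction of the portfolio}, assumed to sum to 1
    """
    total = sum(holdings_value.values())
    assets = set(holdings_value) | set(target_weights)
    return {
        asset: target_weights.get(asset, 0.0) * total - holdings_value.get(asset, 0.0)
        for asset in assets
    }

# Hypothetical portfolio that has drifted from a 60/40 target
trades = rebalance_trades(
    {"STOCKS": 7000.0, "BONDS": 3000.0},
    {"STOCKS": 0.6, "BONDS": 0.4},
)
```

With these made-up numbers the result says to sell about 1000 of STOCKS and buy about 1000 of BONDS. Placing the actual orders is a separate step.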
I started investing first with the help of spreadsheets, then shell scripting, and now Jupyter Notebooks and Python. Beyond LT portfolio tracking, I spend the majority of my time on short to mid-term strategy development, back-testing and implementation; portfolio hedging and leverage; and options trading.
The only manual aspect is actual order placement, which takes a few minutes at most.
This platform lets you do automated trading based on your own strategy. US traders only, for now. https://www.composer.trade
If you are just doing portfolio re-balancing, say, twice a year, you could re-balance based on risk parity across your stocks.
Risk parity is an approach to investment portfolio management which focuses on the allocation of risk, rather than the allocation of capital. The risk parity approach asserts that when asset allocations are adjusted to have the same level of risk, the portfolio can achieve a higher risk-adjusted return.
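As a rough sketch of the idea, the simplest version is inverse-volatility weighting: each holding gets a weight proportional to 1/volatility, so calmer assets get more capital. This is a simplification that ignores correlations between assets (full risk parity equalizes each asset's contribution to total portfolio risk). The function name, tickers, and return series below are hypothetical.

```python
from statistics import stdev

def inverse_volatility_weights(returns_by_asset):
    """Naive risk parity: weight each asset by 1 / volatility, ignoring correlations.

    returns_by_asset: {asset: list of periodic returns}, at least two per asset
    """
    inv_vol = {}
    for name, returns in returns_by_asset.items():
        vol = stdev(returns)  # sample standard deviation of the returns
        if vol == 0:
            raise ValueError(f"{name} has zero volatility; cannot weight it")
        inv_vol[name] = 1.0 / vol
    total = sum(inv_vol.values())
    # Normalize so the weights sum to 1
    return {name: value / total for name, value in inv_vol.items()}

# Made-up return series: STOCK_A swings twice as hard as STOCK_B
sample_returns = {
    "STOCK_A": [0.02, -0.02, 0.02, -0.02],
    "STOCK_B": [0.01, -0.01, 0.01, -0.01],
}
weights = inverse_volatility_weights(sample_returns)
```

Here STOCK_A is twice as volatile as STOCK_B, so it ends up with half the weight (roughly one third versus two thirds).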
Some Quant Resources: https://quantpedia.com
They teach a class on quant. Pretty good. Python oriented. https://quantscience.io
Case in point, my framework for mining companies is here: https://emergingtrajectories.com/a/pub/mining_company_risk_f... You can see the scores here: https://emergingtrajectories.com/c/copper_mining_companies
"Long term" -- we'll see, I expect to hold positions for 12-24 months.
For those interested, my work above is influenced by two important books: "You Can Be a Stock Market Genius Even if You're Not Too Smart" by Joel Greenblatt and "Superforecasting: The Art and Science of Prediction" by Philip Tetlock. The idea from Joel's writing is to look for less liquid or less popular asset classes (or ones that structurally can't be invested in by the pros who are smarter/better-resourced than you), and Tetlock really drills process and research for long-term forecasting.
I haven't looked at it in a while, but it was promoted heavily on some podcasts I listened to years ago when it came out.
Edit:
Ah, I just realised you might have meant the software part more than the financial part. Shyam does publish R code for various things on GitHub.
Why?
What would be the point? HFT works because you can beat the market by being faster, I don't see how long term trading could beat the market unless you have insider information.
And if you can't beat the market, there is absolutely no point in the bot, as you can trivially just buy an index fund tracking the market. Which is also what I am doing, I would never use a bot over that, as it is just additional risk.
You are talking about 2 different things in your post though, I believe: 1 - automating long term investments (this is the Revolut thing), i.e., you set an amount to invest every month and it automatically buys whatever you want; 2 - a research tool? (not a bot though)
Or, it just hit me while writing, are you talking about quantitative strategies? If yes, then yeah, half of Wall Street was working on that! There were some open source attempts, I think the best known was Quantopian - look it up.
When I graduated college, I spent 3 months as a programmer with my econ friend trying to build exactly this. I started off creating a system to paper trade stocks retroactively. You go back in time and pretend it's January 1st, 1982, have an algorithm look at the stocks as of that date, then move forward a day at a time, let it trade through the following 40 years, and see how it does.
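For anyone who wants to try the same thing, a day-by-day replay like this can be sketched in a few lines. This is not the system described above, just a minimal illustration with made-up names: one asset, no costs, and a toy momentum rule standing in for a real model. The key property is that the strategy is only ever handed prices up to yesterday's close.

```python
def walk_forward_backtest(prices, strategy, starting_cash=1000.0):
    """Replay prices one step at a time; the strategy never sees the future.

    prices: list of daily closes
    strategy: callable(history) -> bool, True means hold the asset over the next day
    Returns the equity curve, one value per day.
    """
    equity = starting_cash
    equity_curve = [equity]
    for t in range(1, len(prices)):
        history = prices[:t]          # closes up to and including day t-1 only
        invested = strategy(history)  # decision made at the close of day t-1
        if invested:
            equity *= prices[t] / prices[t - 1]  # earn the day t-1 -> t return
        equity_curve.append(equity)
    return equity_curve

def momentum_strategy(history, lookback=3):
    """Hypothetical toy rule: invest if the last close is above the close `lookback` days earlier."""
    if len(history) <= lookback:
        return False  # not enough data yet, stay in cash
    return history[-1] > history[-1 - lookback]

# Made-up price series for illustration
prices = [100.0, 101.0, 102.0, 103.0, 104.0, 105.0]
curve = walk_forward_backtest(prices, momentum_strategy)
```

Slicing `prices[:t]` before calling the strategy is the whole point of the design: if the strategy were handed the full list, it would be very easy to peek at tomorrow by accident.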
We tried linear models, SVMs, neural networks, RNNs, ensembles, genetic algorithms, anything with stock data, news sentiment data, classic quant structures, and everything in-between. Basically, 3 solid months of coding before I started working.
Anyway, I found out a lot of stuff the hard way, because I didn't have an econ degree.
First off, if you try enough methods, you end up p-hacking or hill climbing the past anyway, and it's no good.
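A quick way to convince yourself of this is to backtest a pile of random "strategies" on pure noise and keep the best one. The sketch below does exactly that; everything in it (function name, parameters, the coin-flip signals) is invented for illustration and has nothing to do with the models we actually tried.

```python
import random

def p_hacking_demo(n_strategies=200, n_days=500, seed=0):
    """Pick the best of many random strategies in-sample, then score it out-of-sample."""
    rng = random.Random(seed)
    # Pure-noise daily returns: by construction there is nothing to predict.
    returns = [rng.gauss(0.0, 0.01) for _ in range(n_days)]
    split = n_days // 2
    results = []
    for _ in range(n_strategies):
        # Each "strategy" is a coin flip per day: long (True) or flat (False).
        signal = [rng.random() < 0.5 for _ in range(n_days)]
        in_sample = sum(r for r, s in zip(returns[:split], signal[:split]) if s)
        out_sample = sum(r for r, s in zip(returns[split:], signal[split:]) if s)
        results.append((in_sample, out_sample))
    # Select the winner by in-sample return only, as a backtest search would.
    best_in, best_out = max(results)
    return best_in, best_out, results

best_in, best_out, results = p_hacking_demo()
```

Comparing `best_in` with `best_out` across a few seeds shows the pattern: the winner was selected for its in-sample luck, and its out-of-sample number is just another random draw.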
Second off, clean historical data is hard to get. It may or may not be adjusted for splits or other corporate actions, so you may inadvertently supply information from the future when playing back from the past. It's hard to get this right.
Third off, many of the models we used were almost always competitive in the 80s (even a linear regression), but in the aughts or 2010s, they stopped being competitive. Our guess was that computer-based trading at hedge funds had made the market more competitive.
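To illustrate the split problem: in raw data a 2-for-1 split looks like a 50% overnight crash, so prices are usually back-adjusted. A minimal sketch, assuming you already have a list of raw closes and a hypothetical `{day_index: ratio}` dict of split events (real vendors also adjust for dividends and use their own conventions):

```python
def back_adjust_for_splits(raw_closes, splits):
    """Divide every close before a split by the split ratio.

    raw_closes: list of unadjusted daily closes
    splits: {day_index: ratio}; a 2-for-1 split first trading on day i has ratio 2.0
    """
    adjusted = []
    factor = 1.0
    # Walk backwards so each day is divided by all splits that happen after it.
    for i in range(len(raw_closes) - 1, -1, -1):
        adjusted.append(raw_closes[i] / factor)
        if i in splits:
            factor *= splits[i]  # applies to every day earlier than i
    adjusted.reverse()
    return adjusted

# Made-up series with a 2-for-1 split taking effect on day 2
adjusted = back_adjust_for_splits([100.0, 102.0, 51.0, 52.0], {2: 2.0})
```

Note the catch for playback: the adjusted price levels for early dates depend on splits that happened later, so a rule keyed to absolute price levels (as opposed to returns) can quietly see the future.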
Fourth, simple models tended to work better. So for instance we may have trained the model on data from 70s-80s, then starting in the 80s, we did online (continuous) training as we moved the model forward in time. There's just not enough data. You can train on all historical stocks or all stocks or related data streams in the industry up to that point, but I think we probably didn't have enough data and the market is competitive.
Fifth, I wish I had read "A Random Walk Down Wall Street" earlier, or all of Taleb's stuff. These are books with a deep mistrust of quants.
Sixth, I think to be competitive, you need to have money in the game, many heuristics, and industry experience. Big firms have this and equipment, but it's hard to get in as an individual.
Seventh, I put several hundred hours into this project and learned a bunch about machine learning and economics. In every way I loved the experience, and I'd encourage you to try it. Probably I'm a n00b here, but I hope some of my notes can help you.