Last Updated on November 3, 2020
Table of Contents
- What is yfinance?
- Is the yfinance library free?
- Why should I use the yfinance library?
- Why shouldn’t I use the yfinance library?
- What are some of the alternatives to the yfinance library?
- How do I get started with the yfinance library?
- How do I download historical data using the yfinance library?
- How do I download fundamental data using the yfinance library?
- How do I download trading data using the yfinance library?
- How do I download options data using the yfinance library?
- Common errors
- Final thoughts
- Link to download code used
What is yfinance?
yfinance is a popular open source library developed by Ran Aroussi as a means to access the financial data available on Yahoo Finance.
Yahoo Finance offers an excellent range of market data on stocks, bonds, currencies and cryptocurrencies. It also offers market news, reports and analysis, as well as options and fundamentals data, which sets it apart from some of its competitors.
Yahoo Finance used to have its own official API, but this was decommissioned on May 15th 2017, following widespread misuse of data.
These days a range of unofficial APIs and libraries exist to access the same data, including of course yfinance.
Note that you might know yfinance under its old name, fix-yahoo-finance. It was renamed on May 26th 2019, at the same time that it underwent a large overhaul to fix some usability issues.
To ensure backwards compatibility, fix-yahoo-finance now imports and uses yfinance anyway, but Ran Aroussi still recommends installing and using yfinance directly.
In this article we will focus mainly on the yfinance library, but we discuss the overall range of options and other alternative providers in more depth in our parent article, Yahoo Finance API – A Complete Guide.
Is the yfinance library free?
Yes, yfinance is completely open source and free. You can find the documentation here.
Why should I use the yfinance library?
- Free
- Quick and easy to set yourself up
- Simple
- High granularity of data (1min/2min/5min data)
- Returns data directly in pandas dataframes/series
As we have just mentioned, yfinance is completely open source and free. There are other ways to access the Yahoo Finance data, some free and some paid, and the paid options do have certain benefits, like being assured of a degree of maintenance. But everybody loves free!
Installation couldn’t be quicker or easier. yfinance has just 4 dependencies, all of which come with Anaconda anyway, and installs fully in a single line of code. No account creation required, or signing up for and using API keys!
It’s simple. yfinance is highly Pythonic in its design and incredibly streamlined. It’s as easy as creating a ticker object for a particular ticker/list of tickers and then just calling all the methods on this object. Like this:
import yfinance as yf
apple= yf.Ticker("aapl")
# show actions (dividends, splits)
apple.actions
# show dividends
apple.dividends
# show splits
apple.splits
# + other methods etc.
Don’t worry, we’ll break down that code further in a bit!
Furthermore, the documentation is concise, fitting on a single page, and the method names are very self-explanatory.
High granularity of data. One cool feature of yfinance is that you can get highly refined data, all the way down to 5 minute, 2 minute and even 1 minute data! The full range of intervals available is:
1m, 2m, 5m, 15m, 30m, 60m, 90m, 1h, 1d, 5d, 1wk, 1mo, 3mo
However it is important to note that the 1m data is only retrievable for the last 7 days, and anything intraday (interval <1d) only for the last 60 days.
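The limits above can be checked with a small helper before making a request. This is only a sketch: `max_lookback_days` is a hypothetical name, not part of yfinance, and the 7-day and 60-day figures are simply the ones quoted above.

```python
from typing import Optional

# Intervals listed in this article; the ones below one day are intraday.
INTRADAY = {"1m", "2m", "5m", "15m", "30m", "60m", "90m", "1h"}
DAILY_OR_LONGER = {"1d", "5d", "1wk", "1mo", "3mo"}


def max_lookback_days(interval: str) -> Optional[int]:
    """Return the lookback limit in days, or None if this article states no limit."""
    if interval == "1m":
        return 7
    if interval in INTRADAY:
        return 60
    if interval in DAILY_OR_LONGER:
        return None
    raise ValueError(f"unknown interval: {interval!r}")


print(max_lookback_days("1m"))   # 7
print(max_lookback_days("15m"))  # 60
print(max_lookback_days("1wk"))  # None
```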
yfinance also handily returns data directly in pandas dataframes or series. This is in contrast to some other options for accessing Yahoo Finance’s data, where you get lengthy JSONs that you need to parse for the specific information you want and then manually convert to dataframes yourself.
Why shouldn’t I use the yfinance library?
- Lacks specialised features
- Some methods are fragile
- Unofficial / not necessarily maintained
- Can get yourself rate limited/blacklisted
Lacks specialised features. Despite the fact you can use it to get a good range of core data, including options and fundamentals data, yfinance doesn’t provide a method to scrape any of the news reports/analysis that are available on Yahoo Finance.
This obviously isn’t ideal if you want to build a model that relies in part on sentiment analysis, so if you want that sort of data, you might want to check out RapidAPI (which we will talk about more shortly), which does offer such data.
Also, other market data alternatives often include interesting extras. For example, Alpha Vantage provides modules that calculate various technical analysis indicators for you, which obviously saves an enormous amount of effort if you want to build an algorithm utilising any of them! yfinance just provides the basics.
Some methods are fragile. yfinance mainly makes API calls to Yahoo Finance to gather its data, but for some of its methods it does occasionally employ HTML scraping and pandas table scraping to unofficially gather the information off the Yahoo Finance website. As such, the functionality of some of its methods is at the mercy of Yahoo not changing the layout or design of some of their pages. In fact, yfinance is widely known to already have a few issues.
As a quick aside, data scraping works by simply downloading the HTML code of a web page, and searching through all the HTML tags to find the specific elements of a page you want.
Take the Yahoo Finance Apple (‘AAPL’) historical data page as an example.
If the method that gets the historical data relied on HTML scraping, it would search through the various div, class and tr tags for particular IDs to pick out the data that should be returned.
For instance, the class name “Py(10px) Pstart(10px)” refers to the historical prices populating the table. If Yahoo Finance were to change the class name pointing to this value, the method might return completely incorrect data, or even nothing at all. Again, this sort of vulnerability doesn’t apply to all of yfinance’s methods (most of them do in fact make direct API calls), but it does affect a few.
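To make that fragility concrete, here is a minimal sketch using Python’s standard `html.parser`. It is not how yfinance is implemented; the `CellScraper` class, the `scrape` helper, the page fragments and the price are all made up for illustration. Only the class name comes from the example above.

```python
from html.parser import HTMLParser


class CellScraper(HTMLParser):
    """Collect the text of every <td> whose class attribute matches exactly."""

    def __init__(self, target_class: str):
        super().__init__()
        self.target_class = target_class
        self.values = []
        self._inside = False

    def handle_starttag(self, tag, attrs):
        if tag == "td" and dict(attrs).get("class") == self.target_class:
            self._inside = True

    def handle_endtag(self, tag):
        if tag == "td":
            self._inside = False

    def handle_data(self, data):
        if self._inside and data.strip():
            self.values.append(data.strip())


def scrape(html: str, target_class: str) -> list:
    parser = CellScraper(target_class)
    parser.feed(html)
    return parser.values


# Made-up page fragments, not real Yahoo Finance markup.
old_page = '<tr><td class="Py(10px) Pstart(10px)">352.84</td></tr>'
new_page = '<tr><td class="Py(12px) Pstart(12px)">352.84</td></tr>'

print(scrape(old_page, "Py(10px) Pstart(10px)"))  # ['352.84']
print(scrape(new_page, "Py(10px) Pstart(10px)"))  # []
```

The second call finds nothing even though the price is still on the page, which is exactly the kind of silent failure a layout change can cause.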
It’s an unofficial solution. Again, because yfinance is simply the result of one man’s hard work and not in any way affiliated with Yahoo Finance, there’s no guarantee it will be fixed if it breaks.
As we already mentioned, it did have a big update to fix issues on May 26th 2019, the same day it was renamed, but that’s no guarantee problems will be fixed in the future. Are you sure you want to build a trading algorithm on top of data that might one day suddenly and without warning be wrong? There are already a few known issues with yfinance, which we will highlight later on in this article.
You can get yourself rate limited/blacklisted. Again, because yfinance scrapes data for a few of its functions, you sometimes run the risk of getting rate limited or blacklisted for too many scraping attempts.
This is a risk that’s always present when trying to scrape websites, but when you’re building applications trading real money on top of infrastructure that might be making a lot of data requests, the balance of risk and reward changes.
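One common way to soften this risk is to retry failed requests with increasing pauses instead of hammering the server. The sketch below is a generic pattern, not something yfinance provides: `call_with_backoff` and `flaky` are hypothetical names, and backing off does not guarantee you won’t be blocked.

```python
import time
from typing import Callable, TypeVar

T = TypeVar("T")


def call_with_backoff(
    fetch: Callable[[], T],
    retries: int = 3,
    base_delay: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Call fetch(), waiting base_delay, 2*base_delay, ... between failed attempts."""
    for attempt in range(retries):
        try:
            return fetch()
        except Exception:
            if attempt == retries - 1:
                raise  # out of attempts, surface the last error
            sleep(base_delay * 2 ** attempt)
    raise ValueError("retries must be at least 1")


# Simulated data request that fails twice before succeeding.
calls = {"n": 0}


def flaky() -> str:
    calls["n"] += 1
    if calls["n"] < 3:
        raise RuntimeError("simulated rate limit")
    return "ok"


waits = []  # record the pauses instead of really sleeping
print(call_with_backoff(flaky, sleep=waits.append))  # ok
print(waits)  # [1.0, 2.0]
```

In real use you would leave `sleep` at its default and pass a function that performs the actual download.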
Conclusion
Overall, yfinance is an incredibly beginner-friendly option. You’ll be able to dive right in and test out ideas without wasting time puzzling over complex documentation, whilst still having access to a good range of data!
That said, the risk of getting faulty data or being blocked from getting any data at all when employing algorithms trading real money is absolutely unacceptable.
We think yfinance is great for prototyping, or if you are a beginner, or just want to download a bunch of historic data.
But if you want complete confidence that a serious trading system is going to function with total reliability, we’d absolutely recommend going with an official alternative market data provider, preferably one claiming to provide low latency data directly from exchanges.
Polygon and IEX might make good bets.
What are some of the alternatives to the yfinance library?
RapidAPI
Of the two alternatives to yfinance we will consider, RapidAPI is the more distinct.
Firstly, whilst it does still have a limited usage free tier, you will have to pay for anything over 500 requests per month.
Secondly, it’s not quite as simple as yfinance to get started with. You will have to sign up for an account to get your own API keys.
That said, a big plus of RapidAPI is that you can use it with 15 different languages, if for some reason Python isn’t your thing.
It also offers a wider range of data than our other options, specifically the option to download market news and analysis, which is fantastically useful if you want to add a degree of sentiment analysis to your model!
Making snap trading decisions based on machine scanning of news far faster than a human ever could can be one way (if slightly uncertain) to gain a trading edge.
That said RapidAPI does have a few drawbacks.
Requests have an average latency of 1660ms, which isn’t terrible, but alternative data providers such as polygon.io offer anything from 200ms down to 1ms delays, which is quite the difference.
More concerning is the fact requests only have a 98% success rate. Having 1 in 50 data requests fail could be a big deal if you have a system trading real money, especially if you are making a lower frequency of calls. Definitely something to consider.
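To put the 98% figure in perspective, here is a quick back-of-the-envelope calculation. It assumes each request fails independently with probability 0.02, which is a simplification for illustration (real failures tend to cluster). Under that assumption the chance of at least one failure in n requests is 1 - 0.98^n.

```python
def prob_at_least_one_failure(n_requests: int, success_rate: float = 0.98) -> float:
    """Probability that at least one of n independent requests fails."""
    return 1.0 - success_rate ** n_requests


for n in (1, 10, 50, 100):
    print(n, round(prob_at_least_one_failure(n), 3))
# 1 0.02
# 10 0.183
# 50 0.636
# 100 0.867
```

So a system making 50 requests would, under this assumption, see at least one failure roughly two times out of three.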
Results returned can also be quite lengthy and nested JSONs, making the data a bit trickier to get ready for use than when using yfinance.
That said, a further plus of RapidAPI is that it offers a huge range of APIs for other purposes, so familiarising yourself with how to use their API for Yahoo Finance data might carry over into easily using another of their APIs for a different project in the future.
In summary, RapidAPI offers a very limited free tier, but perhaps by using a solution where some people are paying, it is more likely that any scraping issues from Yahoo Finance structure changes are resolved more quickly.
It’s also fiddlier to use and harder to get started with, but does provide a bigger range of data than our other two options.
yahoo_fin
yahoo_fin is an open source and free library similar to yfinance.
You can find the documentation here.
It offers a similar range of data to yfinance, but notably has a few functions that generate all the tickers for certain markets for you:
- tickers_dow()
- tickers_nasdaq()
- tickers_other()
- tickers_sp500()
which is a useful feature yfinance lacks.
We actually focus on the yahoo_fin library in the example sections of our parent article, Yahoo Finance API – A Complete Guide, so we won’t talk about it anymore here.
How do I get started with the yfinance library?
Installation
Getting started with the yfinance library is super easy.
It has the following dependencies:
- pandas >= 0.24
- numpy >= 1.15
- requests >= 2.21
- multitasking >= 0.0.7
These all come as standard in an installation with Anaconda, but are really easy to install manually if for some reason you don’t have them.
After that it’s as easy as:
$ pip install yfinance --upgrade --no-cache-dir
or
$ conda install -c ranaroussi yfinance
to install yfinance.
Library Layout
The layout itself is also really simple. There are just three modules:
- yf.Ticker
- yf.download
- yf.pandas_datareader
Almost all the methods are in the Ticker module.
The download module is for rapidly downloading the historical data of multiple tickers at once.
And pandas_datareader is for backward compatibility with legacy code, which we will ignore since, if you’re reading this, you are probably a new user of the library!
How do I download historical data using the yfinance library?
Demo with one ticker
Firstly, let’s import yfinance as yf and create ourselves a ticker object for a particular ticker (stock):
import yfinance as yf
aapl= yf.Ticker("aapl")
aapl
yfinance.Ticker object <AAPL>
Remember, we now use this aapl ticker object for almost everything, calling various methods on it.
To get the historical data we want to use the history() method, which is the most “complicated” method in the yfinance library.
It takes the following parameters as input:
- period: data period to download (either use period parameter or use start and end) Valid periods are:
- “1d”, “5d”, “1mo”, “3mo”, “6mo”, “1y”, “2y”, “5y”, “10y”, “ytd”, “max”
- interval: data interval (1m data is only available for the last 7 days, and data intervals <1d for the last 60 days) Valid intervals are:
- “1m”, “2m”, “5m”, “15m”, “30m”, “60m”, “90m”, “1h”, “1d”, “5d”, “1wk”, “1mo”, “3mo”
- start: If not using period – in the format (yyyy-mm-dd) or datetime.
- end: If not using period – in the format (yyyy-mm-dd) or datetime.
- prepost: Include Pre and Post regular market data in results? (Default is False.) There is usually no need to change this from False.
- auto_adjust: Adjust all OHLC (Open/High/Low/Close prices) automatically? (Default is True.) Just leave this as True and don’t worry about it.
- actions: Download stock dividends and stock splits events? (Default is True.)
That might look a little complex but mainly you will just be changing the period (or start and end) and interval parameters.
So as an example, to get 1-minute historical data for Apple between 02/06/2020 and 07/06/2020 (British format), we just use the ticker object we created and run:
aapl_historical = aapl.history(start="2020-06-02", end="2020-06-07", interval="1m")
aapl_historical
It’s as simple as that!
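Because 1m data is only retrievable for the last 7 days, fixed dates like the ones above stop working once they fall outside that window. A small helper can compute a rolling window instead. This is a sketch using only the standard library; `recent_window` is a hypothetical name, not part of yfinance.

```python
from datetime import date, timedelta
from typing import Optional, Tuple


def recent_window(days: int, today: Optional[date] = None) -> Tuple[str, str]:
    """Return (start, end) as yyyy-mm-dd strings covering the last `days` days."""
    if days <= 0:
        raise ValueError("days must be positive")
    today = today or date.today()
    start = today - timedelta(days=days)
    return start.isoformat(), today.isoformat()


# A fixed "today" keeps the result reproducible.
start, end = recent_window(5, today=date(2020, 6, 7))
print(start, end)  # 2020-06-02 2020-06-07
```

The two strings can then be passed as the start and end arguments of history(), in place of the hard-coded dates.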
Demo with multiple tickers
To download the historical data for multiple tickers at once you can use the download module.
It takes mostly the same arguments as the history() method on a ticker object, but additionally:
- group_by: group by column or ticker (‘column’/’ticker’, default is ‘column’)
- threads: use threads for mass downloading? (True/False/Integer)
- proxy: proxy URL if you want to use a proxy server for downloading the data (optional, default is None)
For example to get the data for Amazon, Apple and Google all at once we can run:
data = yf.download("AMZN AAPL GOOG", start="2017-01-01", end="2017-04-30")
data
Note that the default with no interval specified is daily data.
Then, if we want to group by ticker instead of Open/High/Low/Close we can do:
data = yf.download("AMZN AAPL GOOG", start="2017-01-01",
                   end="2017-04-30", group_by='ticker')
data
How do I download fundamental data using the yfinance library?
Price to Earnings Ratio
You can get the price to earnings ratio from the Ticker.info attribute.
Ticker.info is a dictionary with a wide range of information about a ticker, including such things as a summary description, employee count, market cap, volume, P/E ratios, dividends and more. We recommend taking a look at it yourself as it takes a lot of space to show, but in short, if you can’t find the information you’re looking for with the other methods, try info!
To get the price to earnings ratio specifically, look up ‘forwardPE’ in the dictionary:
aapl = yf.Ticker("aapl")
aapl.info['forwardPE']
22.799461
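Since info is a plain dictionary, a missing key raises a KeyError. Assuming some tickers may not carry every key (an assumption, not something the yfinance documentation promises either way), a defensive lookup can be sketched as follows. `get_ratio` is a hypothetical helper, and the dictionaries are hand-written stand-ins that reuse the values shown in this article rather than live data.

```python
from typing import Any, Mapping, Optional


def get_ratio(info: Mapping[str, Any], key: str = "forwardPE") -> Optional[float]:
    """Return info[key] as a float, or None if the key is missing or empty."""
    value = info.get(key)
    if value is None:
        return None
    return float(value)


# Hand-written stand-ins for Ticker.info, not live data.
sample_info = {"forwardPE": 22.799461, "marketCap": 1525510701056}
empty_info = {}

print(get_ratio(sample_info))  # 22.799461
print(get_ratio(empty_info))   # None
```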
Dividends
You can also get the annual dividend rate (the yearly payout per share, in dollars rather than as a percentage) from info:
aapl.info['dividendRate']
3.2800000000000002
And if you want a breakdown of each dividend payout as it occurred and on what date, you can use Ticker.dividends:
aapl.dividends
Date
1987-05-11 0.00214
1987-08-10 0.00214
1987-11-17 0.00286
1988-02-12 0.00286
1988-05-16 0.00286
...
2019-05-10 0.77000
2019-08-09 0.77000
2019-11-07 0.77000
2020-02-07 0.77000
2020-05-08 0.82000
Name: Dividends, Length: 67, dtype: float64
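The figures above tie together with some simple arithmetic. Taking the last four payouts shown, the sketch below computes the total paid over those four quarters and the latest payout multiplied by four. The second number matches the dividendRate value of about 3.28 shown earlier, which suggests (as an assumption consistent with these numbers, not a documented definition) that dividendRate annualises the most recent payout.

```python
# Last four payouts from the dividends output above.
recent_payouts = [0.77, 0.77, 0.77, 0.82]

# Total paid over the last four quarters.
trailing_total = round(sum(recent_payouts), 2)

# Latest quarterly payout multiplied by four.
annualised_latest = round(recent_payouts[-1] * 4, 2)

print(trailing_total)     # 3.13
print(annualised_latest)  # 3.28
```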
Fundamentals data with multiple tickers at once
We might also want to grab fundamentals (or other) data for a bunch of tickers at once.
Let’s have a go at doing that and then try comparing our tickers by a particular attribute!
To do this we can start by creating a list of the tickers we want to get data for, and an empty dictionary to store all the data.
We will need to use the pandas library to manipulate the data frames:
import pandas as pd
tickers_list = ["aapl", "goog", "amzn", "BAC", "BA"] # example list
tickers_data= {} # empty dictionary
We then loop through the list of tickers, in each case adding to our dictionary a key, value pair where the key is the ticker and the value is a dataframe built from the info dictionary for that ticker:
for ticker in tickers_list:
    ticker_object = yf.Ticker(ticker)
    # convert the info dictionary to a dataframe
    temp = pd.DataFrame.from_dict(ticker_object.info, orient="index")
    temp.reset_index(inplace=True)
    temp.columns = ["Attribute", "Recent"]
    # add (ticker, dataframe) to main dictionary
    tickers_data[ticker] = temp
tickers_data
We then combine this dictionary of dataframes into a single dataframe:
combined_data = pd.concat(tickers_data)
combined_data = combined_data.reset_index()
combined_data
And then delete the unnecessary “level_1” column and clean up the column names:
del combined_data["level_1"] # clean up unnecessary column
combined_data.columns = ["Ticker", "Attribute", "Recent"] # update column names
combined_data
Great, so we now know how to get any data we want for multiple tickers at once into the same dataframe!
But how do we easily compare by a particular attribute?
Comparing by a particular attribute
It’s quite easy actually. Let’s try one of the attributes in info, the fullTimeEmployees count:
employees = combined_data[combined_data["Attribute"]=="fullTimeEmployees"].reset_index()
del employees["index"] # clean up unnecessary column
employees
So now we have a dataframe of just the employee counts, with one entry per ticker, and we can now order by the ‘Recent’ column:
employees_sorted = employees.sort_values('Recent',ascending=False)
employees_sorted
Boom! Obviously not that required with only 5 tickers in our list, but a fantastically easy and powerful way to quickly compare by a particular attribute if we had the ticker list of an entire market!
You can easily use this exact same method to compare any attribute you want!
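The same comparison can also be sketched without dataframes, assuming you have collected each ticker’s info dictionary into one dictionary keyed by ticker. `rank_by_attribute` is a hypothetical helper, and the tickers and employee counts below are placeholders rather than real figures.

```python
from typing import Any, Dict, List, Tuple


def rank_by_attribute(
    infos: Dict[str, Dict[str, Any]], attribute: str, descending: bool = True
) -> List[Tuple[str, Any]]:
    """Return (ticker, value) pairs sorted by attribute, skipping missing values."""
    pairs = [
        (ticker, info[attribute])
        for ticker, info in infos.items()
        if info.get(attribute) is not None
    ]
    return sorted(pairs, key=lambda pair: pair[1], reverse=descending)


# Placeholder numbers for illustration only.
infos = {
    "AAA": {"fullTimeEmployees": 1200},
    "BBB": {"fullTimeEmployees": 56000},
    "CCC": {},  # attribute missing for this ticker
}

print(rank_by_attribute(infos, "fullTimeEmployees"))
# [('BBB', 56000), ('AAA', 1200)]
```

Skipping tickers with a missing value keeps the sort from failing when some entries lack the attribute.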
How do I download trading data using the yfinance library?
You can find the data for all three of Market Cap, Volume and Highs and Lows in the info dictionary.
Market Cap
To get the market cap, use:
aapl.info["marketCap"]
1525510701056
Volume
To find the current volume do:
aapl.info["volume"]
8021292
If you want the average daily volume, which Yahoo Finance calculates over the last three months rather than the last 24 hours, do:
aapl.info["averageVolume"]
42532806
And finally if you want the average volume over the last 10 days:
aapl.info["averageVolume10days"]
39594100
Highs and Lows
Remember, you can find the highs and lows for any time interval:
- “1m”, “2m”, “5m”, “15m”, “30m”, “60m”, “90m”, “1h”, “1d”, “5d”, “1wk”, “1mo”, “3mo”
within a desired period by using the history() method and adjusting the interval.
For example, to get the weekly highs and lows for all the historical data that exists, use:
aapl_historical = aapl.history(period="max", interval="1wk")
aapl_historical
Wow, almost 40 years of data!
Just filter the dataframe with:
- aapl_historical["High"]
- aapl_historical["Low"]
And so forth to get the individual columns.
Alternatively, you can use info to get the following useful high/low information:
- dayHigh
- dayLow
- fiftyTwoWeekHigh
- fiftyTwoWeekLow
For example:
aapl.info["fiftyTwoWeekHigh"]
354.77
How do I download options data using the yfinance library?
Briefly, options are contracts giving a trader the right, but not the obligation, to buy (call) or sell (put) the underlying asset they represent at a specific price on or before a certain date.
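The definition above translates into a simple payoff rule at expiry: a call is worth the amount by which the price exceeds the strike, a put the amount by which it falls short, and either is worth nothing otherwise. A minimal sketch, with hypothetical prices and ignoring the premium paid for the contract:

```python
def call_payoff(spot: float, strike: float) -> float:
    """Value of a call at expiry: exercise only if spot is above strike."""
    return max(spot - strike, 0.0)


def put_payoff(spot: float, strike: float) -> float:
    """Value of a put at expiry: exercise only if spot is below strike."""
    return max(strike - spot, 0.0)


# Hypothetical prices for illustration.
print(call_payoff(spot=110.0, strike=100.0))  # 10.0
print(put_payoff(spot=110.0, strike=100.0))   # 0.0
print(put_payoff(spot=90.0, strike=100.0))    # 10.0
```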
To download options data we can use the option_chain() method. It takes one parameter as input:
- date: (YYYY-MM-DD), expiry date. If None return all options data.
The object it returns, which we will call opt, has calls and puts attributes.
How do I get Expiration dates?
To get the various expiry dates for options for a particular ticker it’s as easy as:
aapl.options
How do I get Calls Data?
To get the calls data, we can do:
# get option chain calls data for specific expiration date
opt = aapl.option_chain(date='2020-07-24')
opt.calls
How do I get Puts Data?
To get puts data, we do:
opt.puts
Finally, opt by itself is an object containing both the calls and puts data together, if that’s useful to you!
Common errors
As we highlighted near the beginning of this article, yfinance is an unofficial scraping solution to gather data from Yahoo Finance, so is subject to breaking if Yahoo Finance changes any of its layout.
Unfortunately this already seems to have happened in part, with the following problems discovered when writing this guide:
- Tickers, the multiple tickers object for interacting with multiple tickers at once, doesn’t seem to work. We have provided a more manual workaround for this in the Fundamentals data with multiple tickers at once section.
- The financials, quarterly_financials, balance_sheet, quarterly_balance_sheet, cashflow, quarterly_cashflow, earnings, quarterly_earnings Ticker methods do not work and return empty dataframes.
This is a big problem, as in many cases there is no alternative way to get the data in some of these methods from other methods in yfinance.
If you are building something that requires any of this data, for example balance sheets and income and cashflow statements and still want free access to the Yahoo Finance data, check out the yahoo_fin library in the examples section of our guide https://algotrading101.com/learn/yahoo-finance-api/ which has working methods to get all of this data!
Final thoughts
So clearly as we have just demonstrated, yfinance is NOT a safe bet to build critical infrastructure on.
If you want to build algorithms trading real money, we absolutely recommend you use an official data source/API, preferably one connected directly to exchange data and with low latency. Something like Polygon.io or IEX might suit you better.
If you absolutely HAVE to use the Yahoo Finance data specifically, we recommend at least paying for an unofficial API like RapidAPI, where you stand a good bet there is an active team of developers constantly maintaining the API. Remember RapidAPI does still have a limited usage free tier!
That said, yfinance can be good to use to build test applications as a beginner, as the sections of it that do work are fantastically easy to get started with and use.
A particular forte of yfinance is that the threads parameter of yf.download allows very rapid downloading of historical data for multiple tickers when set to True!
Link to download code used
You can find the code used in this article here.