StockData

The database contains over 300,000 symbols

These symbols include Equities, ETFs, Funds, Indices, Currencies, Cryptocurrencies, and Money Markets.

The FinanceDatabase serves the purpose of categorizing any type of financial product for anyone completely free of charge. In order to accomplish this, the FinanceDatabase depends on community participation to add, edit, and remove tickers over time. Because CSV files can be easily edited manually, this is made simple enough that anyone, even those without coding experience, can contribute.

The sheer volume of information available online can be overwhelming for a private investor. There are millions of companies and derivatives available on the market, making it extremely difficult to try to understand what kind of companies or ETFs are available. Yes, it is easy to locate the most popular businesses and exchange-traded funds (ETFs) simply because people are familiar with them (such as Microsoft, Tesla, the SandP500 ETF, or an All-World ETF). But it's not always clear what else is out there.

In an effort to address that, this database. Equities, ETFs, Funds, Indices, Currencies, Cryptocurrencies, and Money Markets are represented by its 300,000,000 symbols. As a result, it enables you to get a comprehensive overview of different sectors, industries, investment types, and much more.

Since current fundamentals and stock information can easily be obtained using yfinance or Fundamental Analysis, the explicit goal of this database is not to provide them. Instead, it provides the most crucial details about each product as well as insights into the products that are available in each industry, sector, and country. With this knowledge, you can analyze certain aspects of the financial world and/or locate a difficult-to-find product. See the section Examples for examples of how to combine this database with the aforementioned packages.

Some key statistics of the database:

ProductQuantitySectorIndustriesCountriesExchanges
Equities155.7051624211182
EFTs36.727364*94*100*52
Funds57.8161678*438*100*34
ProductQuantityCategory
Currencies2.590174 Currencies
Cryptocurrencies3.624299 Cryptocurrencies
Indices83.35349 Exchanges
Money Markets1.3842 Exchanges

*These numbers refer to families (iShares, Vanguard) and categories (World Stock, Real Estate) respectively.

** This is an estimation. Obtaining the country distribution can only be done by collecting data on the underlying or by manual search.


The open source investment firm OpenBB, which is democratizing access to investment research, also uses this database. This enables users to efficiently query the FinanceDatabase to obtain data and symbols from a variety of asset classes that they wish to use further inside the OpenBB Terminal and OpenBB SDK. Browse this GitHub repository for more details about OpenBB.

Please be aware that while I am associated with this business, I do not profit from sharing this information; rather, I genuinely think it is a fantastic piece of (free) software that works well with the FinanceDatabase.


Table of contents

Installation

The Package financedatabase allows you to select specific CSV files as well as search through collected data with a specific query.

You can install the package with the following steps:

1- pip install financedatabase

2- (within python) import financedatabase as fd

Basic Usage

This section explains in detail how the database can be queried with the related financedatabase package, also see the Jupyter Notebook in which you can run the examples also demonstrated here. You can find this document here.

Quick Start

Same methods apply to all other asset classes as well. Columns may vary.

import financedatabase as fd

Scroll down below for a more elaborate explanation and detailed examples.

Collecting Information from the Database

Please see the Jupyter Notebook for an elaborate explanation of each asset class. This includes Equities, ETFs, Funds, Indices, Currencies, Cryptocurrencies and Money Markets.


Find code examples of all Asset Classes in the Jupyter Notebook here.


As an example for Equities, If you wish to collect data from all equities you can use the following:

import financedatabase as fd

# Initialize the Equities database
equities = fd.Equities()

# Obtain all data available excluding international exchanges
equities.select()

Which returns the following DataFrame:

SymbolNameCurrencySectorIndustry-groupIndustryExchangeMarketCountyStateCityZipcodeWebsiteMarket-Cap
AAgilent Technologies, IncUSDHealthcarePharmaceuticals, Biotechnology & Life SciencesBiotechnologyNYQus-marketUnited StatesCASanta Clara95051http://www.agilent.comLarge cap
AAAlcoa CorporationUSDMaterialsMaterialsMetas & MiningNYQus-marketUnited StatesPAPittsburgh152125858http://www.alcoa.comMid cap
AAALFAareal Bank AGUSDFinancialsBanksBanksPNKus-marketGermanyNaNWiesbaden65189http://www.aareal-bank.comSmall cap
AAALYAareal Bank AGUSDFinancialsBanksBanksPNKus-marketGermanyNaNWiesbaden65189aareal-bank.comSmall cap
AABBAsia Broadband, Inc.USDMaterialsMaterialsMetals & MiningPNKus-marketUnited StatesNVLas Vegas89135http://www.asiabroadbandinc.comMicro cap

This returns approximately 20.000 different equities. Note that by default, only the American exchanges are selected. These are symbols like TSLA (Tesla) and MSFT (Microsoft) that tend to be recognized by a majority of data providers and therefore is the default. To disable this, you can set the exclude_exchanges argument to False which then results in approximately 155.000 different symbols.

Note that the summary column is taken out on purpose to keep it organized for markdown. The summary is however very handy when it comes to querying specific words as found with the following description given for Apple. All of this information is available when you query the database

Apple Inc. designs, manufactures, and markets smartphones, personal computers, tablets, wearables, and accessories worldwide. It also sells various related services. The company offers iPhone, a line of smartphones; Mac, a line of personal computers; iPad, a line of multi-purpose tablets; and wearables, home, and accessories comprising AirPods, Apple TV, Apple Watch, Beats products, HomePod, iPod touch, and other Apple-branded and third-party accessories. It also provides AppleCare support services; cloud services store services; and operates various platforms, including the App Store, that allow customers to discover and download applications and digital content, such as books, music, video, games, and podcasts. In addition, the company offers various services, such as Apple Arcade, a game subscription service; Apple Music, which offers users a curated listening experience with on-demand radio stations; Apple News+, a subscription news and magazine service; Apple TV+, which offers exclusive original content; Apple Card, a co-branded credit card; and Apple Pay, a cashless payment service, as well as licenses its intellectual property. The company serves consumers, and small and mid-sized businesses; and the education, enterprise, and government markets. It sells and delivers third-party applications for its products through the App Store. The company also sells its products through its retail and online stores, and direct sales force; and third-party cellular network carriers, wholesalers, retailers, and resellers. Apple Inc. was founded in 1977 and is headquartered in Cupertino, California.

Find more elaborate explanation with help(equities.select):

Help on method select in module financedatabase.equities:

As an example, we can use equities.options to obtain specific country, sector and industry options. For we can acquire all industries within the sector Basic Materials within the United States. This allows us to look at a specific industry in the United States in detail.

industry_options = equities.options(selection='industry', country="United States", sector="Materials")

So with this information in hand, I can now query the industry Metals & Mining as follows:

metals_and_mining_companies_usa = equities.select(country="United States", sector="Materials", industry="Metals & Mining")

This gives you a DataFrame like the previous table.

As you can imagine, looking at such a specific selection only yields a few results but picking the entire sector Materials would have returned 403 different companies (which excludes exchanges other than the United States).

Searching the database extensively

All asset classes have the capability to search each column with search, for example equities.search(). Through how this functionality is developed you can define multiple columns and search throughout. For example:

#Collect all Equities Database
equities = fd.Equities()

# Search Multiple Columns
equities.search(summary='automotive', currency='USD', country='Germany')

Which returns a selection of the DataFrame that matches all criteria.

SymbolNameCurrencySectorindustry_groupindustryexchangemarketCountryStateCityZipcodeWebsiteMarket-Cap
AFRMFAlphaform AGUSDIndustrialCapital GoodsMachineryPNKus-marketGermanyNaNFeldkirchen85622NaNNano Cap
AUUMFAumann AGUSDIndustrialCapital GoodsMachineryPNKus-marketGermanyNaNBeelen48361http://www.aumann.comMicro Cap
BAMXFBayerische Motoren Werke AktiengesellschaftUSDConsumer DiscretionaryAutomobiles & ComponentsAutomobilesPNKus-marketGermanyNaNMunich80788http://www.bmwgroup.comLarge Cap
BASFYBASF SEUSDMaterialsMaterialsChemicalsPNKus-marketGermanyNaNLudwigshafen am Rhein67056http://www.basf.comLarge Cap
BDRFFBeiersdorf AktiengesellschaftUSDConsumer staplesHousehold & Personal ProductsHousehold ProductsPNKus-marketGermany

Storing the database at a different location

If you wish to store the database at a different location (for example your own Fork) you can do so with the variable base_url which you can find in each of the asset classes. An example would be:

  • fd.Equities(base_url=<YOUR URL>)

You can also store the database locally and point to your local location with the variable base_url and by setting use_local_location to True. An example would be:

  • fd.Equities(base_url=<YOUR PATH>, use_local_location=True)

Examples

This section gives a few examples of the possibilities with this package. These are merely a few of the things you can do with the package. As you can obtain a wide range of symbols, pretty much any package that requires symbols should work.

Companies in the Netherlands

I want to see how many companies exist in each sector in the Netherlands. Let's count all companies with the following code, I skip a sector when it has no data and also do not include companies that are not categorized:

import financedatabase as fd

equities = fd.Equities()

equities_per_sector_netherlands = {}

for sector in equities.options(selection='sector', country='Netherlands'):
    try:
        equities_per_sector_netherlands[sector] = len(equities.select(country='Netherlands', sector=sector))
    except ValueError as error:
        print(error)

And at last I plot the data in a pie chart and add some formatting to make the pie chart look a bit nicer:

import matplotlib.pyplot as plt

legend, values = zip(*equities_per_sector_netherlands.items())

colors = ['b', 'g', 'r', 'c', 'm', 'y', 'k', 'tab:blue', 'tab:orange', 'tab:gray',
          'lightcoral', 'yellow', 'saddlebrown', 'lightblue', 'olive']
plt.pie(values, labels=legend, colors=colors,
        wedgeprops={'linewidth': 0.5, 'edgecolor': 'white'})
plt.title('Companies per sector in the Netherlands')
plt.tight_layout()

plt.show()

This results in the following graph which gives an indication which sectors are dominant within The Netherlands. Of course this is a mere example and to truly understand the importance of certain companies for the Netherlands, you would need to know market cap of each sector as well including demographics.

Technical Analysis of Biotech ETFs

With the help of ta and yfinance I can quickly perform a basic technical analysis on a group of ETFs categorized by the FinanceDatabase. I start by searching the database for ETFs related to Health and then make a subselection by searching, in the collected database, for biotech-related ETFs:

import financedatabase as fd

etfs = fd.ETFs()

health_care_etfs_in_biotech = etfs.search(category='Health Care', summary='biotech')

Then, I collect stock data on each ticker and remove tickers that have no data in my chosen period. The period I have chosen shows the initial impact of the Coronacrisis on the financial markets.

import yfinance as yf

tickers = list(health_care_etfs_in_biotech.index)

stock_data_biotech = yf.download(tickers, start="2020-01-01", end="2020-06-01")['Adj Close']
stock_data_biotech = stock_data_biotech.dropna(axis='columns')

Next up I initialise subplots and loop over all collected tickers. Here, I create a new temporary DataFrame that I fill with the adjusted close prices of the ticker as well as the Bollinger Bands. Then I plot the data in one of the subplots.

import pandas as pd
from ta.volatility import BollingerBands
import matplotlib.pyplot as plt

figure, axis = plt.subplots(4, 3)
row = 0
column = 0

for ticker in stock_data_biotech.columns:
    data_plot = pd.DataFrame(stock_data_biotech[ticker])
    name = health_care_etfs_in_biotech.loc[health_care_etfs_in_biotech.index == ticker, 'name'].iloc[0]

    indicator_bb = BollingerBands(close=stock_data_biotech[ticker], window=20, window_dev=2)

    data_plot['bb_bbm'] = indicator_bb.bollinger_mavg()
    data_plot['bb_bbh'] = indicator_bb.bollinger_hband()
    data_plot['bb_bbl'] = indicator_bb.bollinger_lband()

    axis[row, column].plot(data_plot)
    axis[row, column].set_title(name, fontsize=6)
    axis[row, column].set_xticks([])
    axis[row, column].set_yticks([])

    column += 1
    if column == 3:
        row += 1
        column = 0

figure.suptitle('Technical Analysis of Biotech ETFs during Coronacrisis')
figure.tight_layout()

This leads to the following graph which gives an indication whether Biotech ETFs were oversold or overbought and how this effect is neutralised (to some degree) in the months after. Read more about Bollinger Bands here.

Silicon Valley's Market Cap

If I want to understand which listed technology companies exist in Silicon Valley, I can collect all equities of the sector 'Technology' and then filter based on city to obtain all listed technology companies in 'Silicon Valley'. The city 'San Jose' is where Silicon Valley is located.

import financedatabase as fd

equities = fd.Equities()

silicon_valley = equities.search(sector='Technology', city='San Jose')

Then I start collecting data with the FundamentalAnalysis package. Here I collect the key metrics which include 57 different metrics (ranging from PE ratios to Market Cap).

import fundamentalanalysis as fa

Then I make a selection based on the last 5 years and filter by market cap to compare the companies in terms of size with each other. This also causes companies that have not been listed for 5 years to be filtered out of my dataset. Lastly, I plot the data.

import fundamentalanalysis as fa

This results in the graph displayed below which separates the small companies from the large companies. Note that this does not include all technology companies in Silicon Valley because most are not listed or are not included in the database of the FundamentalAnalysis package.

Questions & Answers

In this section you can find answers to commonly asked questions. In case the answer to your question is not here, consider creating an Issue.

  • How is the data obtained?

    • The data is an aggregation of a variety of sources and is mostly a curation of myself. Next to that, it is driven by the community to extend further.
  • What categorization method is used?

    • The categorization for Equities is based on a loose approximation of GICS. No actual data is collected from this source and this database merely tries to reflect the sectors and industries as best as possible. This is completely done through manual curation. The actual datasets as curated by MSCI has not been used in the development of any part of this database and remains the most up to date, paid, solution. Other categorizations are entirely developed by the author and can freely be changed.
  • How can I contribute?

  • Is there support for my country?

    • Yes, most likely there is as the database includes 112 countries. Please use fd.obtain_options('equities')['country']
  • How can I find out which countries, sectors and/or industries exists within the database without needing to check the database manually?

    • For this you can use the obtain_options function from the package attached to this database. Furthermore, it is also possible to use equities = fd.Equities() and then use equities.options(selection='country') or specific further with equities.options(selection='sector', country='United States'). Please see this example
  • When I try collect data I notice that not all tickers return output, why is that?

    • Some tickers are merely holdings of companies and therefore do not really have any data attached to them. Therefore, it makes sense that not all tickers return data. If you are still in doubt, search the ticker on Google to see if there is really no data available. If you can't find anything about the ticker, consider updating the database by visiting the Contributing Guidelines.
  • How does the database handle changes to companies over time - like symbol/exchange migration, mergers, bankruptcies, or symbols getting reused?

    • These type of service is what you usually pay a hefty fee for, think of Bloomberg at over $25.000 a year. Instead of requiring your to pay, this is meant to be a community-driven project in which you help in identifiyng these companies. As news about migrations, mergers, bankruptcies and similar occur it is up to the community to identify these and/or users to look into writing scripts that help with this. It is important to note that the vast majority of companies do not change as rapidly that this database becomes irrelevant before it is identified, e.g. a company like Facebook changing to META has already been updated. Furthermore, even though a company goes bankrupt, the old ticker is still relevant when it comes to historic data before the bankruptcy.

User Contributions

This section is meant to thank those that contributed to the project. Looking to contribute as well? Have a look here.

Contact

If you have any questions about the FinanceDatabase or would like to share with me what you have been working on, feel free to reach out to me via: