Basic information on financial databases: cook books, tips and tricks & economic news

This blog contains schematic easy to grasp - hands on - help in performing searches in economic databases, making work sets and making them inter-exchangeable between the databases.

* Disclaimer. I am not a finance professional. Most posts are the result of personal findings.

Note:
All presented images are scaled and can be enlarged to original size (click the picture).

Search This Blog

2/06/2018

Concatenate when converting CUSIP

CUSIP exists in three varieties
CUSIP 6 = Company (issuer)

CUSIP 8 = CUSIP 6 + issue (a CUSIP 8 always refers to one CUSIP 6, but CUSIP 6 can have more than one CUSIP 8)

CUSIP 9 = CUSIP 8 + check digit
Many databases need the CUSIP 9, but many databases use other CUSIPS

Conversions are inevitable.

The following Excel formulas could be useful:

Combining cells with CONCATENATE

Concatenate cells (in Dutch samenvoegen)

=CONCATENATE(A1," ", B1)






Dutch Excel

TEKST.SAMENVOEGEN

=TEKST.SAMENVOEGEN(A1,"",B1)






Same effect, other formula:

Given a table with two data columns

=A1&B1

e.g.




The formula is visible after the ƒx symbol and has to be filled in in Cell C1
With a space between the data:

=A1&" "&B1&"




NOTE: if this formula doesn't work because your opened file conflicts with Excel, try to download your output in Excel format and repair your company identifiers afterwards by adding the lost zeros. See this blog item

1/03/2018

CUSIP 6, CUSIP 8, CUSIP 9 and Repairing CUSIP codes

While the world has embraced ISIN USA sticks with CUSIP

Ideally all financial datbases would uniformly offer all company identifiers

CUSIP exists in three varieties

CUSIP 6 = Company (issuer)
CUSIP 8 = CUSIP 6 + issue (a CUSIP 8 always refers to one CUSIP 6, but CUSIP 6 can have more than one CUSIP 8)
CUSIP 9 = CUSIP 8 + check digit

Not all databases use the same CUSIP.

In WHARTON


Summarizing:
Compustat NA uses CUSIP 9
CRSP uses CUSIP 8
CRSP/Compustat merged uses both
CRSP deals only with stock markets NYSE, AMEX and NASDAQ

Take note of the other company IDENTIFIERS!
As for using Tickers instead of CUSIP.
Tickers are a good identifier in environments that are up to date and don't cover historic tickers.
NASDAQ companies got 4 character ticker codes

CIK: Company Index Key  (10 digits)
 Good instrument but its uses are limited (CRSP and Audit Analytics)
GVKey is a Wharton generated code that only works within the Wharton environment


REMOVING OR ADDING THE 9TH DIGIT?

As for removing the ninth digit from CUSIP:
Open your Tab delimited TXT file (using the wizard) in Excel
Create a new column ; select first cell in that column 
Click formula button in the command rule ƒx
Type in LEFT (enter) and select from options
A pop up appears.
Click in the first cell of the CUSIP column
Next rule: type in number 8 saying you only want to keep the first 8 digits
Drag down result of double click the + in right lower corner of the cell

Note: I am certain there are more catches lurking to meddle your results.
I guess it is recommended to download as many identifiers and information on the issues as you can find.

Don't forget: the number of available stock exchanges in CRSP is limited (NASDAQ, NYSE, AMEX). Compustat might cover more. Download stock exchange code as well.

Second: if you want to download from CRSP and venture to data from Fundamentals Annual, keep in  mind that both databases cover different entities.

TIP: it is easier to start off with CRSP, than haviung to determine which stocks are from the CRSP stock exchanges and which aren't. Use the CRSP/Compustat to make the conversion

Adding the 9th digit to a CUSIP 8
Since the ninth digit is a random figure it is hard to find out which makes a proper CUSIP 9
Extra help is offered through the VU libguide

"Many databases that have information on North America include CUSIP codes, but there are different versions. Some databases include only include issuer level CUSIP-6 codes. Other databases include the CUSIP-8 digit or full CUSIP(CUSIP-9). If you need to use the full cusip code for research and only have a list of Cusip-8 codes, you can use the CUSIP tool to create a text file with the calculated CUSIP-9 codes. The tool was created by Paul Barendregt using Python."
Completing company identifiers (CUSIP)
What if you got company identifiers, and they lost the zeros preceeding the code? (e.g. with opening an excel file containing CUSIP codes)
Example.
Removed zeros.



Checking how many numbers you need for CUSIP is to count the longest CUSIP codes. Or check graph above. In this case it is 9.
Since CUSIP consist of
 company    issue    check digit
123456         78            9
we can conclude that the most right digit in the codes are the check digits.
In other words, we need to add zeros.
In order to retrieve 9 digit CUSIP codes we can add the missing zeros with formula

CUSIP9
=REPT("0";9-LEN(cell nr))&cell nr
no space after ;
GVKEY:
=REPT("0";6-LEN(cell nr))&cell nr
LEN stands for length
no space after ;

IMPORTANT SETTINGS for the formula
Control Panel, language/ region settings: Check under the TAB language > additional settings
List separator ;
or else the formula won't work
Additional additional settings (for Europeans)
Decimal symbol ,
Digit group symbol (thousands and such) .

Close with Apply and OK

The following image shows codes without and with added zeros, plus the formula















RUNNING INTO PROBLEMS?

Recall I mentioned the output conflicting with Excel?
That happens to some people, depending on the computer they are using.
I get it too ...

To be precise: you'll see a little green triangle (1) in the top left of your sells > there is a hidden apostrophe (2) ... and that needs to be removed . . .
(1)                                                                             (2)









How to solve this:

1. make a selection of all codes
2. Copy to Notepad
3. Select all and copy with CTRL-C

4. Paste as number !!
5. It should be OK now.
6. This also (partially) happen to your CUSIP codes.
7. Repeat the same action
8. Note: I ran into another problem > the preceding zeros were removed in all cases where the green triangle happened
9. Use the REPT formula to repair this


ANOTHER WORK AROUND (EASIEST)
Maybe the easiest solution is the following:
Left in the cell there is a hidden apostrophe.
Removing this and thus enabling formulas.


Method:
1. select the column you want to repair
2. go to format cells (right click)
3. select general
next steps:
1. insert a new column > select the first cell
2. Press function button ƒx
3. search for function RIGHT (Rechts)
4. Start up and select first cell of the column you want to repair
5. Depending of the number of characters you want to keep: tell the actual number. So, CUSIP6 number is 6; CUSIP 8 number is 8 etc.


ADDING to make a CUSIP 6 a CUSIP 8 or 9 

It works but entirely depends on the type of CUSIP



Press TAB RESEARCH
The options are somewhat similar to the old interface (which remains posted below the update)




Remove first two characters in a spreadsheet. E.g.: Excel formula ( ƒx) Right number of characters 8, result is CUSIP 8. Copy this into a txt file (note pad) and save.
Next follow menu on WRDS page > choice is convert CUSIP 8 into 9

 



 9
Upload file.
Export as Tab delimited TXT file.











In short steps:

1. go to Wharton TAB Research (top bar)
2. Select CUSIP Converter (right most column)
3. Upload CUSIP 9
4. Determine desired output (CUSIP 9)
5. Determine desired output format (XLS, of TXT)
6. Execute
7. Right click save as...
8. Open in Excel
9. Excel file: repair with REPT("0";9-LEN(CELL NR))&CELL NR
10. type U in first cell next to column
11. Double click + in bottom corner 
12. First  cell in column next
13. fx formula: concatenate: set prefs
14. double click
15: check: go to Datastream and upload and use the cusip