Ideally all financial datbases would uniformly offer all company identifiers
CUSIP exists in three varieties
CUSIP 6 = Company (issuer)
CUSIP 8 = CUSIP 6 + issue (a CUSIP 8 always refers to one CUSIP 6, but CUSIP 6 can have more than one CUSIP 8)
CUSIP 9 = CUSIP 8 + check digit
Not all databases use the same CUSIP.
In WHARTON
Summarizing:
Compustat NA uses CUSIP 9
CRSP uses CUSIP 8
CRSP/Compustat merged uses both
CRSP deals only with stock markets NYSE, AMEX and NASDAQ
Take note of the other company IDENTIFIERS!
As for using Tickers instead of CUSIP.
Tickers are a good identifier in environments that are up to date and don't cover historic tickers.
NASDAQ companies got 4 character ticker codes
CIK: Company Index Key (10 digits)
Good instrument but its uses are limited (CRSP and Audit Analytics)Tickers are a good identifier in environments that are up to date and don't cover historic tickers.
NASDAQ companies got 4 character ticker codes
CIK: Company Index Key (10 digits)
GVKey is a Wharton generated code that only works within the Wharton environment
REMOVING OR ADDING THE 9TH DIGIT?
As for removing the ninth digit from CUSIP:
Open your Tab delimited TXT file (using the wizard) in Excel
Create a new column ; select first cell in that column
Click formula button in the command rule ƒx
Type in LEFT (enter) and select from options
A pop up appears.
Click in the first cell of the CUSIP column
Next rule: type in number 8 saying you only want to keep the first 8 digits
Drag down result of double click the + in right lower corner of the cell
Note: I am certain there are more catches lurking to meddle your results.
I guess it is recommended to download as many identifiers and information on the issues as you can find.
Don't forget: the number of available stock exchanges in CRSP is limited (NASDAQ, NYSE, AMEX). Compustat might cover more. Download stock exchange code as well.
TIP: it is easier to start off with CRSP, than haviung to determine which stocks are from the CRSP stock exchanges and which aren't. Use the CRSP/Compustat to make the conversion
Adding the 9th digit to a CUSIP 8
Since the ninth digit is a random figure it is hard to find out which makes a proper CUSIP 9
Extra help is offered through the VU libguide
"Many databases that have information on North America include CUSIP codes, but there are different versions. Some databases include only include issuer level CUSIP-6 codes. Other databases include the CUSIP-8 digit or full CUSIP(CUSIP-9). If you need to use the full cusip code for research and only have a list of Cusip-8 codes, you can use the CUSIP tool to create a text file with the calculated CUSIP-9 codes. The tool was created by Paul Barendregt using Python."
Completing company identifiers (CUSIP)
What if you got company identifiers, and they lost the zeros preceeding the code? (e.g. with opening an excel file containing CUSIP codes)
Example.
Removed zeros.
Checking how many numbers you need for CUSIP is to count the longest CUSIP codes. Or check graph above. In this case it is 9.
Since CUSIP consist of
company issue check digit
123456 78 9
we can conclude that the most right digit in the codes are the check digits.
In other words, we need to add zeros.
In order to retrieve 9 digit CUSIP codes we can add the missing zeros with formula
CUSIP9
=REPT("0";9-LEN(cell nr))&cell nr
no space after ;
GVKEY:
=REPT("0";6-LEN(cell nr))&cell nr
LEN stands for length
no space after ;
IMPORTANT SETTINGS for the formula
Control Panel, language/ region settings: Check under the TAB language > additional settings
List separator ;
or else the formula won't work
Additional additional settings (for Europeans)
Decimal symbol ,
Digit group symbol (thousands and such) .
Close with Apply and OK
The following image shows codes without and with added zeros, plus the formula
RUNNING INTO PROBLEMS?
Recall I mentioned the output conflicting with Excel?
That happens to some people, depending on the computer they are using.
I get it too ...
To be precise: you'll see a little green triangle (1) in the top left of your sells > there is a hidden apostrophe (2) ... and that needs to be removed . . .
(1) (2)
How to solve this:
1. make a selection of all codes
2. Copy to Notepad
3. Select all and copy with CTRL-C
4. Paste as number !!
5. It should be OK now.
6. This also (partially) happen to your CUSIP codes.
7. Repeat the same action
8. Note: I ran into another problem > the preceding zeros were removed in all cases where the green triangle happened
9. Use the REPT formula to repair this
ANOTHER WORK AROUND (EASIEST)
Maybe the easiest solution is the following:
Left in the cell there is a hidden apostrophe.
Removing this and thus enabling formulas.
Method:
1. select the column you want to repair
2. go to format cells (right click)
3. select general
next steps:
1. insert a new column > select the first cell
2. Press function button ƒx
3. search for function RIGHT (Rechts)
4. Start up and select first cell of the column you want to repair
5. Depending of the number of characters you want to keep: tell the actual number. So, CUSIP6 number is 6; CUSIP 8 number is 8 etc.
ADDING to make a CUSIP 6 a CUSIP 8 or 9
It works but entirely depends on the type of CUSIP
Press TAB RESEARCH
The options are somewhat similar to the old interface (which remains posted below the update)
Remove first two characters in a spreadsheet. E.g.: Excel formula ( ƒx) Right number of characters 8, result is CUSIP 8. Copy this into a txt file (note pad) and save.
Next follow menu on WRDS page > choice is convert CUSIP 8 into 9
9
Upload file.
Export as Tab delimited TXT file.
In short steps:
1. go to Wharton TAB Research (top bar)
2. Select CUSIP Converter (right most column)
3. Upload CUSIP 9
4. Determine desired output (CUSIP 9)
5. Determine desired output format (XLS, of TXT)
6. Execute
7. Right click save as...
8. Open in Excel
9. Excel file: repair with REPT("0";9-LEN(CELL NR))&CELL NR
10. type U in first cell next to column
11. Double click + in bottom corner
12. First cell in column next
13. fx formula: concatenate: set prefs
14. double click
15: check: go to Datastream and upload and use the cusip