Text mining for financial and non-financial information from SEC filings, and textual analysis for predictive models and risk assessment

Srivastava, Rajendra P.

IIMA Institutional Repository


Please use this identifier to cite or link to this item: http://hdl.handle.net/11718/22932

Title:	Text mining for financial and non-financial information from SEC filings, and textual analysis for predictive models and risk assessment
Authors:	Srivastava, Rajendra P.
Keywords:	Text Mining;SEC Filings;Textual Analysis;Risk Assessment;Predictive Models
Issue Date:	17-Sep-2019
Publisher:	Indian Institute of Management Ahmedabad
Abstract:	Recent developments in information technology, in particular text mining tools and search engines such as Google, are changing not only the way we gather information and conduct research, but also the kinds of questions researchers ask and answer. Alongside these technological developments, individual researchers’ programming skills also make a difference in how, and what kinds of, information one can gather. Textual analysis is becoming more popular in accounting and financial research. For example, Loughran and McDonald (2011) performed a textual analysis of a large sample of annual reports (10-Ks) of US public companies for 1994-2008 and demonstrated a link between their word lists in various categories (positive, uncertain, litigious, strong modal, and weak modal) and 10-K filing returns, trading volume, subsequent return volatility, fraud, material weakness, and unexpected earnings. More recently, Liu and Moffitt (2016) conducted a textual analysis of SEC comment letters, developed a measure of intensity based on the modality of the letters, and observed that this intensity is positively associated with the probability of a restatement of the reviewed 10-K filings. Moreover, textual analysis and text mining techniques provide information about companies’ performance that is not otherwise available. Elaborating on the value of textual analysis, Li (2010, p. 144) states: “As a communication vehicle for management, textual disclosures can provide a means for researchers to assess managers’ behavioral biases and understand firm behavior.” Tetlock, Saar-Tsechansky, and Macskassy (2008) examine the use of a simple quantitative measure of language to predict individual firms’ accounting earnings and stock returns. Lee, Churyk, and Clinton (2013) develop a fraud detection model based on textual analysis; they state that “Conventional fraud detection measures using ratio analysis and other financial data were either unable to detect the fraud or unable to detect it soon enough to avoid catastrophic outcomes”. Li, Lundholm, and Minnis (2013) develop a model of management’s perception of the intensity of competition using textual analysis of firms’ 10-K filings.
Developing individual expertise in programming, especially in Perl and Python, is time-consuming and, in fact, a waste of time and resources. SeekiNF (https://www.seekedgar.com), an intelligent, cloud-based search engine developed at the University of Kansas, provides a set of tools to gather financial and non-financial information from SEC filings, perform textual analysis with its built-in features, and develop analytical predictive models for assessing risks such as financial risk, litigation risk, and fraud risk. SeekiNF currently provides access to 17 million US SEC filings and 33 million documents, and returns search results in a matter of seconds. The presentation will focus on a live demonstration of SeekiNF’s features for gathering information by querying the system.
Description:	Text Mining for Financial and Non-Financial Information from SEC Filings, and Textual Analysis for Predictive Models and Risk Assessment, presented by Dr. Rajendra P. Srivastava, Emeritus Professor at the University of Kansas, on Tuesday, September 17, 2019
URI:	http://hdl.handle.net/11718/22932
Appears in Collections:	R & P Seminar
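
The word-list style of textual analysis mentioned in the abstract can be illustrated with a minimal Python sketch. It computes, for each category, the share of a document's words that appear in that category's list. The word lists, the function name `category_proportions`, and the sample sentence are all hypothetical and chosen for illustration; they are not the published Loughran and McDonald lists, and this is not how SeekiNF works internally.

```python
import re
from collections import Counter

# Hypothetical mini word lists, for illustration only.
# These are NOT the published Loughran and McDonald (2011) lists.
WORD_LISTS = {
    "positive": {"achieve", "gain", "improve", "strong"},
    "uncertain": {"may", "might", "uncertain", "approximately"},
    "litigious": {"lawsuit", "plaintiff", "litigation"},
}


def category_proportions(text):
    """Return, per category, the fraction of tokens found in that word list."""
    # Lowercase the text and keep alphabetic runs as tokens.
    tokens = re.findall(r"[a-z]+", text.lower())
    total = len(tokens)
    counts = Counter(tokens)
    return {
        category: (sum(counts[w] for w in words) / total if total else 0.0)
        for category, words in WORD_LISTS.items()
    }


# Hypothetical sentence standing in for a passage from a filing.
sample = "The company may face litigation, and results might not improve."
scores = category_proportions(sample)
print(scores)
```

In a research setting, proportions like these would typically be computed per filing and then related to outcomes such as returns or restatements; that modelling step is outside the scope of this sketch.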

Files in This Item:

File	Description	Size	Format
RP_Sep_17_2019.html	RP_Sep_17_2019	1.2 kB	HTML
