Use this URL to cite or link to this record in EThOS:
Title: Machine learning and alternative data analytics for fashion finance
Author: Bainiaksinaite, Julija
ISNI:       0000 0004 9352 6940
Awarding Body: UCL (University College London)
Current Institution: University College London (University of London)
Date of Award: 2020
Availability of Full Text:
Access from EThOS:
Full text unavailable from EThOS. Please try the link below.
Access from Institution:
This dissertation investigates the application of Machine Learning, Natural Language Processing and computational finance to a novel area Fashion Finance. Specifically identifying investment opportunities within the Apparel industry using influential alternative data sources such as Instagram. Fashion investment is challenging due to the ephemeral nature of the industry and the difficulty for investors who lack an understanding of how to analyze trend-driven consumer brands. Unstructured online data (e-commerce stores, social media, online blogs, news, etc.), introduce new opportunities for investment signals extraction. We focus on how trading signals can be generated from the Instagram data and events reported in the news articles. Part of this research work was done in collaboration with Arabesque Asset Management. Farfetch, the online luxury retailer, and Living Bridge Private Equity provided industry advice. Research Datasets The datasets used for this research are collected from various sources and include the following types of data: - Financial data: daily stock prices of 50 U.S. and European Apparel and Footwear equities, daily U.S. Retail Trade and U.S. Consumer Non-Durables sectors indices, Form 10-K reports. - Instagram data: daily Instagram profile followers for 11 fashion companies. - News data: 0.5 mln news articles that mention selected 50 equities. Research Experiments The thesis consists of the below studies: 1. Relationship between Instagram Popularity and Stock Prices. This study investigates a link between the changes in a company's popularity (daily followers counts) on Instagram and its stock price, revenue movements. We use cross-correlation analysis to find whether the signals derived from the followers' data could help to infer a company's future financial performance. Two hypothetical trading strategies are designed to test if the changes in a company's Instagram popularity could improve the returns. To test the hypotheses, Wilcoxon signed-rank test is used. 2. Dynamic Density-based News Clustering. The aim of this study is twofold: 1) analyse the characteristics of relevant news event articles and how they differ from the noisy/irrelevant news; 2) using the insights, design an unsupervised framework that clusters news articles and identifies events clusters without predefined parameters or expert knowledge. The framework incorporates the density-based clustering algorithm DBSCAN where the clustering parameters are selected dynamically with Gaussian Mixture Model and by maximizing the inter-cluster Information Entropy. 3. ALGA: Automatic Logic Gate Annotator for Event Detection. We design a news classification model for detecting fashion events that are likely to impact a company's stock price. The articles are represented by the following text embeddings: TF-IDF, Doc2Vec and BERT (Transformer Neural Network). The study is comprised of two parts: 1) we design a domain-specific automatic news labelling framework ALGA. The framework incorporates topic extraction (Latent Dirichlet Allocation) and clustering (DBSCAN) algorithms in addition to other filters to annotate the dataset; 2) using the labelled dataset, we train Logistic Regression classifier for identifying financially relevant news. The model shows the state-of-the-art results in the domain-specific financial event detection problem. Contribution to Science This research work presents the following contributions to science: - Introducing original work in Machine Learning and Natural Language Processing application for analysing alternative data on ephemeral fashion assets. - Introducing the new metrics to measure and track a fashion brand's popularity for investment decision making. - Design of the dynamic news events clustering framework that finds events clusters of various sizes in the news articles without predefined parameters. - Present the original Automatic Logic Gate Annotator framework (ALGA) for automatic labelling of news articles for the financial event detection task. - Design of the Apparel and Footwear news events classifier using the datasets generated by the ALGA's framework and show the state-of-the-art performance in a domain-specific financial event detection task. - Build the 'Fashion Finance Dictionary' that contains 320 phrases related to various financially-relevant events in the Apparel and Footwear industry.
Supervisor: Not available Sponsor: Not available
Qualification Name: Thesis (Ph.D.) Qualification Level: Doctoral
EThOS ID:  DOI: Not available