This dissertation focuses on advancing machine learning, with particular emphasis on its application to financial trading. It is organized into two parts. The first part (Chapters 1-2) concerns the application of predictive modeling to stock market prediction. Chapter 1 presents the basics of machine learning and deep learning. In Chapter 2, we combine several recent advances in deep learning to build a hybrid model for forecasting stock prices, which allows us to learn from various aspects of the related information. In particular, we take a close look at representation learning and temporal convolutional networks for sequential modeling. With representation learning, we derive an embedding called Stock2Vec, which gives us insight into the relationships among different stocks, while the temporal convolutional layers automatically capture effective temporal patterns both within and across series. Our hybrid framework integrates both advantages and achieves better performance on the stock price prediction task than several popular benchmark models.
In the second part of this dissertation (Chapters 3-6), we turn our focus to reinforcement learning. In Chapter 3, we provide the necessary mathematical and theoretical preliminaries of reinforcement learning, as well as several recent advances in deep Q-networks (DQNs) that we apply later. In Chapters 4 and 5, we aim to algorithmically improve the convergence of training in reinforcement learning, with theoretical analysis and empirical experiments. One prominent challenge in reinforcement learning is the tradeoff between exploration and exploitation. In DQNs, this is usually addressed by monotonically decreasing the exploration rate, yet this approach is often unsatisfactory. In Chapter 4, we propose to encourage exploration by resetting the exploration rate when necessary. Another severe problem in training DQNs is the overestimation of Q-values. In Chapter 5, we propose to bootstrap the estimates from multiple agents, and refer to this learning paradigm as cross Q-learning. Our algorithm effectively reduces the overestimation and significantly outperforms the state-of-the-art DQN training algorithms. In Chapter 6, we continue our study of DQNs with an application in a real financial trading environment, training a DQN agent that provides trading strategies. Finally, we summarize this dissertation in Chapter 7 and discuss possible directions for future research.