Artificial Intelligence (AI) has become an integral part of our daily lives and recently gained widespread public interest with the impressive introduction of ChatGPT in November 2022. The popular chatbot utilizes the technology of Deep Neural Networks (DNNs), which has become popular among researchers since the network “AlexNet” scored first place in the ImageNet competition in 2012. Despite their outstanding performance, DNNs pose significant technical and environmental challenges. The computational and memory requirements during training and inference are immense, resulting in substantial energy consumption and carbon dioxide emissions. In addition, high-performance computing hardware, such as a GPU, is necessary to achieve a reasonable runtime, which prevents DNNs from running directly on mobile or embedded devices. Network quantization is a promising way to address these challenges. Instead of using 32-bit floating-point values, lower-bit values are used to represent features and weights in the network. The most extreme case, a Binary Neural Network (BNN), uses 1-bit values, which allows for model compression and speedup during inference, albeit at the cost of slightly decreased accuracy. The efficient implementation allows these binary models to run on less powerful CPUs while also saving energy, e.g., when running on battery-powered devices. This thesis explores various model design principles that enhance the accuracy and efficiency of BNNs and demonstrates their effectiveness on standard benchmark datasets. Before proposing our first model architecture and approach, we examine the importance of shortcut connections in previous BNN architectures and formulate golden rules for designing accurate and compact BNNs. These insights are then used to remove network bottlenecks and construct BinaryDenseNet based on dense shortcut connections. 
In the second approach, we concentrate on solving both core problems of binary networks at the same time: reduced feature capacity and reduced feature quality. The proposed solution, which combines dense blocks with a novel improvement block design, is the basis for building the network MeliusNet. The evaluation of MeliusNet shows that it can challenge the accuracy and efficiency of the popular compact neural network MobileNet. Finally, in our third approach, we concentrate on maximizing the energy efficiency of BNNs by reducing the remaining 32-bit values and propose BoolNet. Its energy consumption is measured through hardware simulations, which show that it is more energy-efficient than other works. To promote reproducibility and support future research, the code for all methods proposed in this thesis is published individually or as part of open-source frameworks developed during this work. The technical design of these frameworks is presented briefly in the later parts of this work, together with demo applications. Afterward, we present the work related to the different approaches in this thesis and conclude with an outlook on the future of BNNs.

