AI, Cybersecurity, Software Engineering and Beyond.
Large Language Models (LLMs) are developed to understand the probability distribution that governs the world language space. Autoregressive models approximate this distribution by predicting subsequent words based on previous context, forming a Markov chain. World knowledge (often referred as parametric knowledge) is stored implicitly within the model's parameters.
Pre-reading Requirements In this post, I assume you have a basic background in software and cybersecurity engineering. However, even if you're not highly technical, don't worry, I will ensure that you can grasp and understand the intricacies of the vulnerability and the exploit, as well as
Large Language Models (LLMs), exemplified by OpenAI’s GPT-4 and Meta’s LLaMA, continue to impress us with their capabilities, which have surpassed expectations from just a few years ago. Recently, the research community has shifted its focus towards the optimal and efficient usage of resources. Concepts like the Mixture
DAST and black-box approaches are methods used to test the security of an application by analyzing its behavior in response to inputs without having knowledge of the application’s internal structure or the code being executed. These approaches rely on generating inputs through methods such as brute force and randomness,
Traditional security measures, including firewalls, intrusion detection systems and AVs aim to prevent malicious activities by identifying and blocking known threats before they can cause harm. These security measures frequently employ signature-based detection methods, complemented by heuristic, machine learning and behavior analysis techniques. RASP (in short for Runtime Application Self
Transformer-based deep learning models, such as GPT-3 and LLaMA, have achieved state-of-the-art results on many NLP tasks. These models have exhibited outstanding performance and are capable of resolving tasks on the fly through in-context-learning (ICL) without the need for retraining. This approach helps to avoid the well-known catastrophic forgetting problem.
The urge to train expansive deep learning models, particularly large language models, is ever-growing. A single GPU often falls short in providing the required memory capacity to accommodate various parameters and data, thus necessitating the employment of multiple GPUs. Additionally, the time cost of training complex models can be daunting.
Tests are a crucial part of the software development life cycle (SDLC) and are used to ensure that the software is functioning as intended. There are various types of tests that can be performed, including functional and non-functional tests. Functional testing is used to verify that the software meets its
Scaling up the size of models leads to a considerable augmentation in computational expenses, both during training and inference phases. In a bid to harness the benefits of parameter scaling without an equivalent surge in computational requirements, the Mixture of Experts (MoE) approach was developed for expansive language models. Within