Programming Languages – High-Level Code & Machine Code

Candidates should be able to:

  • explain the difference between high level code and machine code
  • explain the need for translators to convert high level code to machine code
  • describe the characteristics of an assembler, a compiler and an interpreter
  • describe common tools and facilities available in an integrated development environment (IDE): editors, error diagnostics, run-time environment, translators, auto-documentation.

What is machine code?

Machine code is the lowest level of programming language because the instructions are executed directly by a computer’s central processing unit (CPU). It is important to understand that every CPU or CPU family has its own machine code instruction set.

Machine code instructions are simply numbers stored as a binary pattern of bits, e.g. 1010100101000000.

A machine code instruction is typically made up of two parts:

  • an operator (op code), which is the instruction part that the CPU executes (carries out), e.g. 10101001 could be an instruction for the CPU to load from memory.
  • an operand (typically a memory address where data is read from or written to, depending on the operator), e.g. 01000000 could be memory address 64.

Some op codes, such as the one to END a program, do not require an operand.
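
The split between op code and operand can be sketched in a few lines of Python. This is only an illustration: the function name `decode` is made up for this example, and it assumes the 16-bit layout used above (8 bits of op code followed by 8 bits of operand).

```python
def decode(instruction: str) -> tuple[str, int]:
    """Split a 16-bit instruction into an 8-bit op code and an 8-bit operand."""
    if len(instruction) != 16 or set(instruction) - {"0", "1"}:
        raise ValueError("expected a 16-character string of 0s and 1s")
    opcode = instruction[:8]            # first 8 bits: what to do
    operand = int(instruction[8:], 2)   # last 8 bits: binary converted to a number
    return opcode, operand


opcode, operand = decode("1010100101000000")
print(opcode, operand)  # 10101001 64
```

A real CPU does this decoding in hardware, and many instruction sets use op codes and operands of other sizes.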

Writing programs directly in machine code would be tedious and error-prone as all the numerical addresses for branch instructions and data locations would need to be calculated manually. Assembly language was an initial solution to this problem, followed by increasingly sophisticated higher-level programming languages.

Because machine code instructions are the only ones the CPU can execute, the source code for ALL other programming languages must be converted into machine code before it can be executed. This translation is carried out by special programs called translators, of which there are three types: compilers, interpreters and assemblers.

The Little Man Computer (LMC) is a simulator that models a simple Von Neumann architecture computer. It can be programmed in assembly language or machine code and includes a basic guide and some example programs. There are two versions available:

  • The Java applet version – an online version recommended by OCR. Note that most web browsers now block Java applets.
  • The VB.NET version – based on the Java applet version, it can be downloaded and installed on a Windows computer.

What is high-level code and why does it need a translator?

High-level code uses words that are designed to be read by human programmers as well as a computer. Statements written in high-level languages such as Visual Basic, C++, Python, Delphi and Java are therefore understood far more easily than programs written in machine code or assembly language.

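High-level code is also portable: the same source code can usually be run on different types of computer and operating system, provided a suitable translator is available for each one.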

High-level source code will make use of some or all of the following:

  • Keywords – reserved words such as SORT, IF and FUNCTION, which are simple to understand but stand for operations that would take many instructions to program in machine code.
  • Syntax – rules for the use of keywords and the arguments that go with them.
  • Structures – iteration and conditional programming structures that are more complex than simple branches.

The CPU can only execute machine code instructions. This means that code written in any other programming language has to be translated into machine code before it can be executed. This translation is carried out by two types of program:

An interpreter

  • An interpreter allows the programmer to run the source code but only within the interpreter. It does this by translating the source code into the equivalent machine code line-by-line, as the program is running. This makes the program run relatively slowly as each instruction has to be translated before it can be executed and an error will cause the program to stop at that line. However, it is ideal during the development stages as it allows the programmer to quickly test their source code and resume the program once an error is fixed.
  • Advantages of using an interpreter:
    • It is easier to check for errors than with a compiler because the error can easily be traced to the line of source code that generated it.
    • It is faster to develop software because the whole program does not need to be compiled every time something needs to be tested.
  • Disadvantages of using an interpreter:
    • The program cannot be executed without the source code.
    • Because the source code needs to be available and is usually just text, the program to be executed is less secure.
    • An interpreted program will execute more slowly than a compiled program due to the line-by-line translation.
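
The line-by-line behaviour described above can be sketched with a toy interpreter. Everything here is an assumption made for illustration: the mini-language (SET, ADD, PRINT) is invented, and the sketch carries out each line directly in Python rather than producing real machine code.

```python
def interpret(source: str) -> tuple[dict[str, int], list[str]]:
    """Run a tiny made-up language one line at a time."""
    variables: dict[str, int] = {}
    output: list[str] = []
    for line_number, line in enumerate(source.splitlines(), start=1):
        parts = line.split()
        if not parts:
            continue  # skip blank lines
        command, args = parts[0], parts[1:]
        try:
            if command == "SET":
                name, value = args
                variables[name] = int(value)
            elif command == "ADD":
                name, value = args
                variables[name] += int(value)
            elif command == "PRINT":
                (name,) = args
                output.append(str(variables[name]))
            else:
                raise ValueError(f"unknown command {command!r}")
        except (ValueError, KeyError) as error:
            # Execution stops at the first faulty line, which is easy to locate.
            output.append(f"Error on line {line_number}: {error}")
            break
    return variables, output


program = "\n".join(["SET x 5", "ADD x 3", "PRINT x", "MUL x 2", "PRINT x"])
variables, output = interpret(program)
print(output)  # ['8', "Error on line 4: unknown command 'MUL'"]
```

The first three lines run and produce output before the error on line 4 is ever examined, which is why an interpreter makes it quick to test part of a program.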

A compiler

  • A compiler is typically used once the source code has been developed and tested, for example using an interpreter. It translates the completed source code into machine code and creates a new file which can be executed by the CPU as a stand-alone program. This translation can involve several stages and may take a considerable amount of time because one source code instruction may translate into hundreds of machine code instructions.
  • Advantages of using a compiler:
    • The source code is not included so compiled machine code is much more secure than interpreted code.
    • It produces an executable file so the program can be run without the source code.
  • Disadvantages of using a compiler:
    • It is a slow process translating the source code into machine code.
    • Even the smallest change means that the whole compilation process has to be done again.
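
The idea of translating once and then running the result can be loosely illustrated in Python. Note the assumption: Python's built-in `compile()` produces bytecode for the Python virtual machine, not machine code for the CPU, so this is an analogy rather than a true compiler producing a stand-alone file.

```python
import dis

source = "total = sum(n * n for n in range(10))"

# Translate the whole source text once, before anything is run.
code_object = compile(source, "<example>", "exec")

# One line of high-level code becomes several lower-level instructions.
instructions = list(dis.get_instructions(code_object))
print(len(instructions), "bytecode instructions for one line of source")

# The translated code can now be run without translating the text again.
namespace = {}
exec(code_object, namespace)
print(namespace["total"])  # 285
```

The exact number of bytecode instructions depends on the Python version, so no figure is given here.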

What is assembly language and what is an assembler?

Because each machine code instruction is just made up of numbers stored as a binary bit pattern it is very difficult for humans to read or develop software directly using machine code. Assembly language was the original attempt to solve this problem (followed by increasingly advanced high-level programming languages).

Assembly language is a very simple programming language that uses mnemonics (memory aids) to directly represent machine code instructions. It uses labels to represent the memory addresses of branch destinations and data.

An assembler translates assembly language instructions into machine code instructions.

There is nearly a one-to-one match between assembly language instructions and machine code instructions. This means that a machine code program translated from assembly language is very efficient and therefore tends to require less memory and run faster than a machine code program translated from a high-level language using a compiler.

The table below shows some examples of assembly language mnemonics and the 16-bit machine code instructions produced when they are assembled. Each machine code instruction is made up of an operator (op code) and an operand.

Assembly language mnemonic | Machine code     | What it does
LDA #64                    | 1010100101000000 | Load* the accumulator with the number 64
ADC #64                    | 0110100101000000 | Add the number 64 to whatever is in the accumulator
LDA 64                     | 1010010101000000 | Load* the accumulator with the data stored in memory address 64
ADC 64                     | 0110010101000000 | Add to the accumulator the data stored in memory address 64

*The machine code instruction to load a fixed number into the accumulator is different from the one to load a number from a memory address, even though the mnemonic is the same. In assembly language, the # symbol is used to denote a number rather than a memory address.
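
Because the match is nearly one-to-one, an assembler for these four instructions can be sketched as a simple lookup. This is a minimal illustration, not a real assembler: the names `OPCODES` and `assemble` are made up, and the op code used for ADC with a memory address (01100101) is an assumption taken from the 6502 instruction set, which these mnemonics resemble.

```python
# Op code for each (mnemonic, addressing mode) pair.
OPCODES = {
    ("LDA", "immediate"): "10101001",  # load a fixed number
    ("ADC", "immediate"): "01101001",  # add a fixed number
    ("LDA", "address"): "10100101",    # load from a memory address
    ("ADC", "address"): "01100101",    # add from a memory address (assumed)
}


def assemble(line: str) -> str:
    """Translate one assembly instruction into a 16-bit machine code string."""
    mnemonic, operand = line.split()
    # A leading # means the operand is a number rather than a memory address.
    mode = "immediate" if operand.startswith("#") else "address"
    value = int(operand.lstrip("#"))
    if not 0 <= value <= 255:
        raise ValueError("operand must fit in 8 bits")
    return OPCODES[(mnemonic, mode)] + format(value, "08b")


print(assemble("LDA #64"))  # 1010100101000000
print(assemble("LDA 64"))   # 1010010101000000
```

Each line of assembly language produces exactly one machine code instruction, which is why assembling is so fast.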

Advantages of writing programs in assembly language:

  • It usually creates fast running programs because the one-to-one match means that the machine code program created will tend to be very efficient.
  • The translation into machine code will be very fast due to the one-to-one match between assembly language instructions and machine code instructions.
  • It is easier to understand than machine code due to the use of mnemonics.
  • Labels can be used to name memory addresses. Without these, adding or removing an instruction means all the memory addresses referred to in the program need to be recalculated.
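
The benefit of labels can be sketched as a two-pass process: first note where each label sits, then swap every label for its address. This is a simplified illustration with assumptions: the function name `resolve_labels` is made up, JMP and END are used as example mnemonics, and each instruction is assumed to occupy exactly one memory location.

```python
def resolve_labels(lines: list[str]) -> list[str]:
    """Replace label operands with the numeric address of the labelled line."""
    # Pass 1: record the address (instruction number) of every label.
    labels: dict[str, int] = {}
    instructions: list[str] = []
    for line in lines:
        if ":" in line:
            label, line = line.split(":", 1)
            labels[label.strip()] = len(instructions)
        if line.strip():
            instructions.append(line.strip())
    # Pass 2: swap each label used as an operand for its address.
    resolved = []
    for instruction in instructions:
        mnemonic, *operands = instruction.split()
        operands = [str(labels.get(operand, operand)) for operand in operands]
        resolved.append(" ".join([mnemonic, *operands]))
    return resolved


program = ["start: LDA #1", "loop: ADC #1", "JMP loop", "END"]
print(resolve_labels(program))  # ['LDA #1', 'ADC #1', 'JMP 1', 'END']

# Inserting an instruction moves the loop, and the address is recalculated.
longer = ["start: LDA #1", "ADC #5", "loop: ADC #1", "JMP loop", "END"]
print(resolve_labels(longer))  # ['LDA #1', 'ADC #5', 'ADC #1', 'JMP 2', 'END']
```

The programmer only ever writes `JMP loop`; the jump address changes from 1 to 2 without any manual recalculation.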

Disadvantages of writing programs in assembly language:

  • Different versions of an assembly language are often required for different processors making it difficult to transfer programs between processors.
  • Assembly language programs are often written for specific hardware which means they are often incompatible with different hardware.
  • A lot of assembly code is needed to do relatively simple tasks, so complex programs require many assembly instructions and take a long time to write.

What common tools and facilities are available in an integrated development environment (IDE)?

An integrated development environment (IDE) is a software application that provides a complete set of facilities for a computer programmer to develop a software application in a high-level language, test it and then convert it into a stand-alone machine code application.

A screenshot of an IDE used to write programs in C++

An IDE is often designed for a particular high-level language and will normally consist of:

  • an editor – to allow source code to be entered and edited.
  • build automation tools – to automate a wide variety of tasks such as entering source code with the correct syntax, managing variables etc.
  • error diagnostics – to produce error messages that highlight what an error is and where it has occurred (either by highlighting it or indicating the line number of the error).
  • an interpreter – to allow source code to be translated line by line into machine code instructions so they can be executed from within the IDE.
  • a compiler – to translate the completed source code into machine code so it can be executed as a stand-alone program file.
  • a run-time environment – to allow the programmer to test the program while it is running. This allows the program to be run in an environment where the programmer can track the instructions and variables being processed by the program and diagnose any errors that might occur. If the program crashes, the run-time environment keeps running and can provide information about why the crash occurred.
  • auto-documentation – to create reference manuals in formats such as text or HTML files by extracting the comments, where available, from the source code. Such comments can be written by the programmer as they create and modify the source code, making it much easier to keep the documentation up-to-date.
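
Auto-documentation can be illustrated with Python's standard `inspect` module, which reads the documentation comment (docstring) attached to a function. This is a minimal sketch: the functions `area_of_rectangle` and `document` are made up for this example, and real IDE tools produce far more complete manuals.

```python
import inspect


def area_of_rectangle(width: float, height: float) -> float:
    """Return the area of a rectangle given its width and height."""
    return width * height


def document(function) -> str:
    """Build a short reference entry from a function's own source comments."""
    signature = inspect.signature(function)
    summary = inspect.getdoc(function) or "(no documentation)"
    return f"{function.__name__}{signature}\n    {summary}"


print(document(area_of_rectangle))
```

Because the entry is generated from the comment in the source code, updating the comment updates the documentation the next time it is generated.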
 
