Linux File Identification: Do file extensions matter in Linux?


Every operating system, whether Linux, Windows, or macOS, must be able to identify file types in some way.

With that information, the OS can associate a file with the right application or action when you open it.

Each operating system has its own approach to identifying files, but most identification is done through:

  1. File extensions
  2. The contents of the file, checked through MIME types and file signatures.

If you are coming from a Windows environment, you know how heavily the OS relies on file extensions to open and execute files. File type identification on Windows is based on file extensions.

A file extension is the set of characters that appears after the last period (.) in a file name to indicate the file's type or format. For example, “docx” is the file extension that identifies the Microsoft Word file format.

However, when it comes to Linux, file identification does not rely on file extensions.

File extensions do not matter to Linux itself: the kernel treats a file name as just a name, and tools rely on content-based file identification to determine what type of file it is. They check MIME types and file signatures (magic bytes) that identify the type of a file.

However, for Linux users and applications, file extensions still help to quickly identify the file type.

Let’s get into details to see how Linux identifies files and how that differs from a user’s or application’s perspective.

Why file extensions do not matter on Linux

Linux does not use the extension to determine what type of file it is, whether an executable, a data file, or a configuration file.

Linux also does not use the extension to determine which program to open a file with.
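As a rough illustration of the two points above, the sketch below creates a shell script whose name ends in .txt and runs it directly. The file name hello.txt and the temporary directory are made up for this demo. The underlying assumption, which the article does not spell out, is that Linux decides whether a file can run from its execute permission and its first line (#!/bin/sh here), not from its name.

```shell
# Create a small shell script whose name ends in .txt (hypothetical demo file).
workdir=$(mktemp -d)
cat > "$workdir/hello.txt" <<'EOF'
#!/bin/sh
echo "hello from a .txt file"
EOF

# The execute permission bit, not the file name, makes the file runnable.
chmod +x "$workdir/hello.txt"

# Run the file directly; the #!/bin/sh line tells the system which interpreter to use.
output=$("$workdir/hello.txt")
echo "$output"
```

If the temporary directory is on a filesystem mounted with the noexec option, the direct execution will be refused, so treat this as a sketch rather than a guaranteed result on every machine.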

How users and applications recognize files

Linux itself does not follow the convention that humans use for recognizing files by name, nor the extension-based identification that many applications use.

When users and applications interact with a file name, they usually rely on the file extension to determine whether it is an image, a video, or a text file.

That way, the right application or action can be chosen for the file.

That is why it is so easy on Windows to change a file's extension and disguise a program file as a text file (executable.exe.txt).

In this case, the user cannot tell from the name what the “txt” file really contains. Upon opening it, the system either executes it or, if it relies on the file extension, opens it as a text file that appears to hold corrupted data.

So, at this point, what you should know is that users and applications rely heavily on file extensions to determine a file's type and format.

That is how you can tell that a file is an image by looking at its extension (.jpg, .jpeg, .png, etc.). The same goes for documents (.pdf, .docx, .txt, .xlsx, etc.), videos (.mp4, .avi, .mov, .mkv, etc.), audio (.wav, .mp3, .aac, etc.), and programming language-specific files (.py, .js, .cpp, etc.).

However, Linux itself does not rely heavily on file extensions to determine the file type or the action to take when you open a file, for example by double-clicking it.

How does Linux examine a file?

Linux determines the file type and the action to take when opening files using content-based file identification.

Content-based file identification heavily relies on the metadata contained in a file to determine the appropriate application or action to take with a specific file. The identification approach relies on MIME types and file signatures that use magic numbers.

MIME types file identification on Linux

MIME (Multipurpose Internet Mail Extensions) types are standardized labels, such as text/plain or image/png, that classify files by their nature and format.

As it is classification-based, Linux relies on a database to associate a file with a particular MIME type.

So, before opening a file, Linux examines the file to determine its MIME type and then looks up the application associated with that MIME type. Finally, Linux can choose the right action or application to manipulate, open, or execute the file.

So, if the MIME type determined for a file is “text/plain”, Linux knows that the file is a plain text file and that it can be opened with a text editor.

Here are the most common MIME types and their file associations on Linux:

| MIME Type | File Association |
| --- | --- |
| text/plain | .txt, .log, .cfg, .conf |
| application/pdf | .pdf |
| application/json | .json |
| application/xml | .xml |
| application/zip | .zip |
| application/gzip | .gz, .tar.gz, .tgz |
| application/x-gzip | .gz (legacy alias of application/gzip) |
| application/x-bzip2 | .bz2 |
| application/x-tar | .tar |
| application/x-rar | .rar |
| application/x-7z-compressed | .7z |
| image/jpeg | .jpg, .jpeg |
| image/png | .png |
| image/gif | .gif |
| image/bmp | .bmp |
| audio/mpeg | .mp3 |
| audio/ogg | .ogg |
| video/mp4 | .mp4 |
| video/quicktime | .mov |
| video/webm | .webm |
| application/msword | .doc |
| application/vnd.openxmlformats-officedocument.wordprocessingml.document | .docx |
| application/vnd.ms-excel | .xls |
| application/vnd.openxmlformats-officedocument.spreadsheetml.sheet | .xlsx |
| application/vnd.ms-powerpoint | .ppt |
| application/vnd.openxmlformats-officedocument.presentationml.presentation | .pptx |
| application/rtf | .rtf |
| application/octet-stream | No specific file extension |

MIME types and their file associations

If you want to check on the file type based on MIME types on Linux, you can use the xdg-mime command.

How to check the MIME type of a file using a Linux terminal

Step 1: Open the terminal using CTRL + ALT + T


Step 2: Use the command xdg-mime query filetype /path-to-your-file to check the MIME type of a file.

For example, to check on the MIME type of a .txt file, you should use the following command:

xdg-mime query filetype file.txt

The same for Python files

xdg-mime query filetype main.py

You should get the MIME type of each file as the result, for example text/plain for the text file.

[Figure: example of MIME-type-based file identification on Linux]

Use of magic numbers to identify files on Linux

The other approach to content-based file identification on Linux is reading the magic numbers at the beginning of a file.

Magic numbers, also known as file signatures, are a sequence of bytes at the beginning of a file that identifies its format and type.

As you may know, a file is a sequence of bytes arranged in an organized manner, and each byte is made up of eight bits (1s and 0s).

File formats take advantage of this by reserving the first few bytes of a file for a signature. For example, every PNG file begins with the same eight bytes, and every PDF file begins with the characters %PDF.

These signatures help identify the file type and format, and Linux tools rely on them heavily.

So, it works like this:

  1. When a file is accessed, the system reads the first few bytes of the file to extract the magic numbers.
  2. The extracted magic bytes are cross-referenced against a database of known file formats and their associated types.
  3. Based on the match, Linux identifies the file's format, its type, and how it should be handled.
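The steps above can be sketched in a few lines of shell. This is only an illustration under stated assumptions: the function names magic_hex and guess_type are made up here, the signature table holds just five well-known formats, and real tools consult a far larger database. The sample file notes.txt is also hypothetical; it holds the 8-byte PNG signature under a misleading name.

```shell
# Step 1: read the first N bytes of a file (default 8) and print them as hex.
magic_hex() {
    head -c "${2:-8}" "$1" | od -An -tx1 | tr -d ' \n'
}

# Steps 2 and 3: compare the leading bytes against a tiny, illustrative
# signature table and report the matching type.
guess_type() {
    case "$(magic_hex "$1")" in
        89504e470d0a1a0a*) echo "PNG image" ;;
        255044462d*)       echo "PDF document" ;;
        7f454c46*)         echo "ELF executable" ;;
        504b0304*)         echo "ZIP archive" ;;
        1f8b*)             echo "gzip data" ;;
        *)                 echo "unknown" ;;
    esac
}

# Demo: write the 8-byte PNG signature into a file with a .txt name.
sample="$(mktemp -d)/notes.txt"
printf '\211PNG\r\n\032\n' > "$sample"

magic_hex "$sample"; echo
guess_type "$sample"
```

The lookup never consults the file name, so the .txt ending has no effect on the answer.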

With file signatures, Linux can identify a file's type even when its extension is misleading, as with the Windows example of an executable disguised as a text file.

A Linux command that relies on magic numbers to identify the file type or MIME type of a file is the file command.

Here’s how to use it to determine the type of a file.

Open the Terminal.

Use the command file /path-to-your-file

file file.txt

In the example above, the reported file type of the text file should be ASCII text.
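To try the disguised-file scenario for yourself, here is a small sketch that assumes the file utility is installed. It writes a shell script into a file named photo.jpg (a made-up name for this demo) and asks file what it really is. The --mime-type option prints a MIME type instead of a description, and -b keeps the file name out of the output.

```shell
# Create a shell script but give it an image extension (hypothetical demo file).
demo_dir=$(mktemp -d)
printf '#!/bin/sh\necho "not really a picture"\n' > "$demo_dir/photo.jpg"

# Full description, based on the contents of the file.
file "$demo_dir/photo.jpg"

# MIME type only; -b (brief) leaves the file name out of the output.
detected=$(file --mime-type -b "$demo_dir/photo.jpg")
echo "$detected"
```

Both commands should describe a shell script rather than a JPEG image, because file looks at the first bytes of the file and not at the .jpg ending.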

To recap,

Linux does not rely on file extensions to identify file types. Instead, it relies on more reliable content-based approaches such as MIME types and file signatures.

Why content-based file identification is better than using extensions to identify files

Content-based file identification is more accurate because it examines the actual data present within a file. Exploits such as spoofed file extensions are much harder to pull off with content-based identification.

Relying on content-based file identification improves security because a malicious file cannot pass as another type just by carrying a deceptive extension.

Content-based file identification is more flexible at handling new and unknown file types. It can also handle files that have no extension at all.

Can a file be without an extension?

A file can be without an extension. For example, you can have a text file named ‘myfile’ without the .txt extension. Linux can handle such a file appropriately because it does not rely on file extensions to access or execute a file.

However, some applications may handle a file with no extension incorrectly. In such a case, you may get meaningless garbled data or an “unsupported file” error.

To know the file type of a file with no extension on Linux, you can use the file or xdg-mime file utilities.

Using file utility to get the file type of a file with no extension:

Open the Terminal and execute the following command:

file file-with-no-extension

Even without a file extension at the end of the file name, the file utility returns the correct file type of file-with-no-extension.

The same happens when using the xdg-mime utility to get the file type of a file with no extension:

xdg-mime query filetype file-with-no-extension

The command above produces the result:

text/plain

However, it is good practice to name your file with file extensions.

Best practices for file naming and extensions in Linux

  1. Avoid creating files with no extensions. If you are a programmer, the programs you write will often need to reference file extensions. Besides, a file extension helps you identify a file's type, and the appropriate application to open it with, at first glance.
  2. Use appropriate file extensions to name your files. Name your files depending on the type of data they hold. For example, text files should have the .txt extension.
  3. Use file extensions to avoid ambiguity between files that share the same base name. A descriptive file extension helps prevent accidental file overwrites.
  4. Name files consistently and with meaningful extensions, which helps in categorizing and organizing files more efficiently.

Most common Linux file extensions to use

Although Linux does not use file extensions to determine the type of a file, humans do. Thus, it is essential to end your file names with an extension so that you and other people who use your computer can tell whether the file they are browsing is a text file, video file, document, or image.

Besides, file extensions are important in version control systems and when collaborating with other programmers.

Here are file extensions that you can use on Linux to name your files depending on the data they hold and their intended purpose.

| File Extension | File intent/use or data |
| --- | --- |
| .deb | Debian Software Package |
| .txt | Plain Text File |
| .doc, .docx | Microsoft Word Document |
| .rtf | Rich Text Format |
| .odt | OpenDocument Text (LibreOffice, OpenOffice) |
| .xls, .xlsx | Microsoft Excel Spreadsheet |
| .ods | OpenDocument Spreadsheet (LibreOffice, OpenOffice) |
| .ppt, .pptx | Microsoft PowerPoint Presentation |
| .odp | OpenDocument Presentation (LibreOffice, OpenOffice) |
| .jpg, .jpeg | JPEG Image |
| .png | Portable Network Graphics Image |
| .gif | Graphics Interchange Format Image |
| .bmp | Bitmap Image |
| .svg | Scalable Vector Graphics Image |
| .mp3 | MP3 Audio File |
| .wav | Waveform Audio File Format |
| .ogg | Ogg Vorbis Audio File |
| .mp4 | MPEG-4 Video File |
| .avi | Audio Video Interleave File |
| .mkv | Matroska Video File |
| .mov | QuickTime Video File |
| .zip | Zip Archive |
| .tar | Tape Archive |
| .gz | Gzip Compressed Archive |
| .bz2 | Bzip2 Compressed Archive |
| .7z | 7-Zip Compressed Archive |
| .exe | Windows Executable |
| .sh | Shell Script |
| .rpm | Red Hat Package Manager Package |
| .c, .cpp | C/C++ Source Code |
| .java | Java Source Code |
| .py | Python Script |
| .html, .htm | HTML Web Page |
| .css | Cascading Style Sheet |
| .js | JavaScript File |
| .sqlite | SQLite Database File |
| .db | Database File (Generic) |
| .conf | Configuration File (Generic) |
| .ini | INI Configuration File |

This table covers the most common file extensions and their associated functions or data types.

How to display file extensions when listing files on a Linux terminal

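The ls utility, which lists files on Linux, prints each file's full name, including its extension. To display the file extensions of files on a Linux terminal, you can use the ls -l or ls -p commands. The -l flag prints a long listing with permissions, owner, size, and modification time, while the -p flag appends a slash (/) to directory names so that they stand out from regular files.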

Here’s how to list the files and their file extensions on a Linux terminal:

Step 1: Open the Terminal on your Linux machine.


Step 2: Navigate into the directory you want to list your files using the cd command.

Step 3: Use the ls -l command to list the files in the directory, including their file extensions

ls -l

The command displays a long-format list of the files and directories in the current directory, with their metadata, names, and file extensions.

Step 4: Use ls -p for a concise list of file names

To display only the file names and their extensions without the additional file metadata, use the ls -p command.

ls -p
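If you only want the extension part of each name, shell parameter expansion can pull it out. The helper below is a minimal sketch; the name ext_of and the sample file names are made up for illustration, and it treats everything after the last dot as the extension, so archive.tar.gz is reported as gz.

```shell
# Print the extension of a file name, or "(none)" when it has no extension.
ext_of() {
    name=${1##*/}                    # strip any leading directory part
    case "$name" in
        ?*.*) echo "${name##*.}" ;;  # text after the last dot
        *)    echo "(none)" ;;       # no dot, or only a leading dot (hidden file)
    esac
}

# Demo with a few made-up file names.
for f in notes.txt archive.tar.gz README .bashrc; do
    printf '%s -> %s\n' "$f" "$(ext_of "$f")"
done
```

Hidden files such as .bashrc are reported as having no extension, since the leading dot marks them as hidden rather than separating a name from an extension.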

… and that’s it!

If you are new to Linux, you should definitely check out this guide I have written for you:

How can beginners learn to code in Linux

Badi

Badi here, creator of ngangasn.com— A website dedicated to providing helpful information and how-to's of web development and hosting. Inspired by a fascination to write the most efficient code to make a computer laugh, & humans, Steve has a passion for organizing characters to create code and informative content.
What makes me happy?
Well, nothing like the feeling of finally figuring out that one pesky bug that's been driving me crazy.
