Implement telfhash for ELF import table #936

Merged
merged 21 commits into from
Apr 14, 2021
Changes from 3 commits
4bb8979
Implement telfhash for import table and add TLSH to the project
HoundThe Mar 20, 2021
6505da0
comment the import symbol filter regexes
HoundThe Mar 23, 2021
2e5d12c
Use std::set for faster lookup
HoundThe Mar 23, 2021
c27d3d7
Address code review comments
HoundThe Mar 30, 2021
0ec716c
better formatting
HoundThe Mar 30, 2021
71d31cc
Move TLSH to deps/ using cmake
HoundThe Apr 1, 2021
6611b0f
Forgot to commit tlsh headers
HoundThe Apr 1, 2021
7d5382d
Restructure elf_format to get symbols in the same manner as telfhash
HoundThe Apr 1, 2021
53ac63e
Ignore symbols from dynamic segments
HoundThe Apr 2, 2021
9fca602
First exclude then convert to lower_case
HoundThe Apr 2, 2021
3d1720d
mask out symbol visibility from others
HoundThe Apr 2, 2021
1209eb7
Move telfhash outside import table to elf_format, use TLSH for all im…
HoundThe Apr 6, 2021
7321cf0
Fix uninitialized value
HoundThe Apr 6, 2021
052432a
Fixed TLSH build on Windows
HoundThe Apr 6, 2021
536bef1
fileformat/CMakeLists.txt: do not add tlsh-related stuff
PeterMatula Apr 13, 2021
711463d
deps/tlsh: refactor CMake
PeterMatula Apr 13, 2021
6299548
cmake/options.cmake: move TLSH to deps section
PeterMatula Apr 14, 2021
10217f7
deps/tlsh/cmake: add new line at the end
PeterMatula Apr 14, 2021
9c175a3
fileformat/elf_format: C comment -> C++ comment
PeterMatula Apr 14, 2021
6ddeb7e
fileformat/elf_import_table.h: add missing new line
PeterMatula Apr 14, 2021
c5c91b0
fileformat: remove trailing spaces
PeterMatula Apr 14, 2021
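
The commit messages above mention a few ordering decisions (a `std::set` for lookups, excluding symbols before lower-casing them). A minimal sketch of how such an input string could be prepared before it is handed to TLSH is shown below. The function name `prepareTelfhashInput`, the contents of `kExcluded`, and the sort-then-join-with-commas step are assumptions for illustration, not code taken from this PR.

```cpp
#include <algorithm>
#include <cctype>
#include <cstddef>
#include <set>
#include <string>
#include <vector>

// Hypothetical exclusion list. The commits suggest the real filter also
// uses regexes and a larger set of names.
static const std::set<std::string> kExcluded = {"__libc_start_main", "main", "abort"};

// Build the text that would be fed to the fuzzy hash (assumed format:
// lower-cased symbol names, sorted, joined with commas).
std::string prepareTelfhashInput(const std::vector<std::string>& symbols)
{
    std::vector<std::string> kept;
    for (const auto& name : symbols) {
        // Exclude first, using the original spelling ...
        if (name.empty() || kExcluded.count(name) != 0) {
            continue;
        }
        // ... and only then convert to lower case.
        std::string lower = name;
        std::transform(lower.begin(), lower.end(), lower.begin(),
            [](unsigned char c) { return static_cast<char>(std::tolower(c)); });
        kept.push_back(lower);
    }

    std::sort(kept.begin(), kept.end());

    std::string joined;
    for (std::size_t i = 0; i < kept.size(); ++i) {
        if (i != 0) {
            joined += ',';
        }
        joined += kept[i];
    }
    return joined;
}
```

Because the exclusion check runs on the original spelling, a symbol named `Main` would not match the `main` entry in this sketch and would survive as `main` after lower-casing. Whether that matches the reference telfhash behaviour should be checked against its implementation.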
1 change: 1 addition & 0 deletions include/retdec/fileformat/file_format/elf/elf_format.h
@@ -13,6 +13,7 @@

#include "retdec/fileformat/file_format/file_format.h"
#include "retdec/fileformat/types/note_section/elf_notes.h"
#include "retdec/fileformat/types/import_table/elf_import_table.h"

namespace retdec {
namespace fileformat {
18 changes: 18 additions & 0 deletions include/retdec/fileformat/types/import_table/elf_import_table.h
@@ -0,0 +1,18 @@
/**
* @file include/retdec/fileformat/types/import_table/elf_import_table.h
* @brief Class for ELF import table.
* @copyright (c) 2021 Avast Software, licensed under the MIT license
*/

#include "import_table.h"

namespace retdec {
namespace fileformat {

class ElfImportTable : public ImportTable
{
public:
void computeHashes() override;
};
} // namespace fileformat
} // namespace retdec
9 changes: 6 additions & 3 deletions include/retdec/fileformat/types/import_table/import_table.h
@@ -11,7 +11,6 @@
#include <vector>

#include "retdec/fileformat/types/import_table/import.h"

namespace retdec {
namespace fileformat {

@@ -20,14 +19,15 @@ namespace fileformat {
*/
class ImportTable
{
-private:
+protected:
using importsIterator = std::vector<std::unique_ptr<Import>>::const_iterator;
std::vector<std::string> libraries; ///< name of libraries
std::vector<std::string> missingDeps; ///< missing dependencies
std::vector<std::unique_ptr<Import>> imports; ///< stored imports
std::string impHashCrc32; ///< imphash CRC32
std::string impHashMd5; ///< imphash MD5
std::string impHashSha256; ///< imphash SHA256
std::string impHashTlsh; ///< imphash TLSH
public:
/// @name Getters
/// @{
@@ -39,6 +39,7 @@ class ImportTable
const std::string& getImphashCrc32() const;
const std::string& getImphashMd5() const;
const std::string& getImphashSha256() const;
const std::string& getImpHashTlsh() const;
const std::vector<std::string> & getMissingDependencies() const;

std::string getLibrary(std::size_t libraryIndex) const;
@@ -55,7 +56,7 @@ class ImportTable

/// @name Other methods
/// @{
-void computeHashes();
+virtual void computeHashes();
void clear();
void addLibrary(std::string name, bool missingDependency = false);
void addImport(std::unique_ptr<Import>&& import);
@@ -70,6 +71,8 @@ class ImportTable
void dump(std::string &dumpTable) const;
void dumpLibrary(std::size_t libraryIndex, std::string &libraryDump) const;
/// @}

virtual ~ImportTable() = default;
};

} // namespace fileformat
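
The three changes to `ImportTable` in this file (members moved to `protected`, `computeHashes()` made `virtual`, and a virtual destructor added) work together so that a format-specific table can replace the hashing step. A minimal sketch of that pattern follows. The `...Sketch` class names, the simplified `addImport(std::string)` signature, and the placeholder digest are hypothetical stand-ins, not the retdec classes themselves.

```cpp
#include <memory>
#include <string>
#include <utility>
#include <vector>

// Simplified stand-in for ImportTable (hypothetical, for illustration only).
class ImportTableSketch
{
protected:
    // Protected rather than private so a subclass can read the imports
    // and store its own hash.
    std::vector<std::string> imports;
    std::string impHashTlsh;

public:
    // Virtual destructor: deleting a derived table through a base pointer
    // is well defined.
    virtual ~ImportTableSketch() = default;

    void addImport(std::string name) { imports.push_back(std::move(name)); }
    const std::string& getImpHashTlsh() const { return impHashTlsh; }

    // Default behaviour; subclasses may override it.
    virtual void computeHashes() { impHashTlsh.clear(); }
};

// Simplified stand-in for ElfImportTable (hypothetical).
class ElfImportTableSketch : public ImportTableSketch
{
public:
    void computeHashes() override
    {
        // Placeholder digest. Real code would hash the symbol names with TLSH.
        impHashTlsh = "symbols:" + std::to_string(imports.size());
    }
};
```

A caller that only holds a `std::unique_ptr<ImportTableSketch>` still reaches the ELF-specific override when it calls `computeHashes()`, which is presumably why the base method had to become virtual.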
183 changes: 183 additions & 0 deletions include/retdec/fileformat/utils/tlsh/tlsh.h
@@ -0,0 +1,183 @@
// tlsh.h - TrendLSH Hash Algorithm

/*
* TLSH is provided for use under two licenses: Apache OR BSD.
* Users may opt to use either license depending on the license
* restrictions of the systems with which they plan to integrate
* the TLSH code.
*/

/* ==============
* Apache License
* ==============
* Copyright 2013 Trend Micro Incorporated
*
* Licensed under the Apache License, Version 2.0 (the "License");
* you may not use this file except in compliance with the License.
* You may obtain a copy of the License at
*
* http://www.apache.org/licenses/LICENSE-2.0
*
* Unless required by applicable law or agreed to in writing, software
* distributed under the License is distributed on an "AS IS" BASIS,
* WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
* See the License for the specific language governing permissions and
* limitations under the License.
*/

/* ===========
* BSD License
* ===========
* Copyright (c) 2013, Trend Micro Incorporated
* All rights reserved.
*
* Redistribution and use in source and binary forms, with or without modification,
* are permitted provided that the following conditions are met:
*
* 1. Redistributions of source code must retain the above copyright notice, this
* list of conditions and the following disclaimer.
*
* 2. Redistributions in binary form must reproduce the above copyright notice,
* this list of conditions and the following disclaimer in the documentation
* and/or other materials provided with the distribution.

* 3. Neither the name of the copyright holder nor the names of its contributors
* may be used to endorse or promote products derived from this software without
* specific prior written permission.
*
* THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS" AND
* ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED
* WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED.
* IN NO EVENT SHALL THE COPYRIGHT HOLDER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT,
* INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING,
* BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
* DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF
* LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE
* OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED
* OF THE POSSIBILITY OF SUCH DAMAGE.
*/

#ifndef HEADER_TLSH_H
#define HEADER_TLSH_H

#if defined WINDOWS || defined MINGW
#include "win_version.h"
#else
#include "version.h"
#endif

#ifndef NULL
#define NULL 0
#endif

#ifdef __cplusplus

class TlshImpl;

// Define TLSH_STRING_LEN_REQ, which is the string length of "T1" + the hex value of the Tlsh hash.
// BUCKETS_256 & CHECKSUM_3B are compiler switches defined in CMakeLists.txt
#if defined BUCKETS_256
#define TLSH_STRING_LEN_REQ 136
// changed the minimum data length to 256 for version 3.3
#define MIN_DATA_LENGTH 50
// added the -force option for version 3.5
// added the -conservative option for version 3.17
#define MIN_CONSERVATIVE_DATA_LENGTH 256
#endif

#if defined BUCKETS_128
#define TLSH_STRING_LEN_REQ 72
// changed the minimum data length to 256 for version 3.3
#define MIN_DATA_LENGTH 50
// added the -force option for version 3.5
// added the -conservative option for version 3.17
#define MIN_CONSERVATIVE_DATA_LENGTH 256
#endif

#if defined BUCKETS_48
// No 3 Byte checksum option for 48 Bucket min hash
#define TLSH_STRING_LEN 30
// changed the minimum data length to 256 for version 3.3
#define MIN_DATA_LENGTH 10
// added the -force option for version 3.5
#define MIN_CONSERVATIVE_DATA_LENGTH 10
#endif

#define TLSH_STRING_BUFFER_LEN (TLSH_STRING_LEN_REQ+1)

#ifdef WINDOWS
#include <WinFunctions.h>
#else
#if defined(__SPARC) || defined(_AS_MK_OS_RH73)
#define TLSH_API
#else
#define TLSH_API __attribute__ ((visibility("default")))
#endif
#endif

class TLSH_API Tlsh{

public:
Tlsh();
Tlsh(const Tlsh& other);

/* allow the user to add data in multiple iterations */
void update(const unsigned char* data, unsigned int len);

/* to signal the class there is no more data to be added */
void final(const unsigned char* data = NULL, unsigned int len = 0, int fc_cons_option = 0);

/* to get the hex-encoded hash code */
const char* getHash(int showvers=0) const ;

/* to get the hex-encoded hash code without allocating buffer in TlshImpl - bufSize should be TLSH_STRING_BUFFER_LEN */
const char* getHash(char *buffer, unsigned int bufSize, int showvers=0) const;

/* to bring the object back to the initial state */
void reset();

// access functions
int Lvalue();
int Q1ratio();
int Q2ratio();
int Checksum(int k);
int BucketValue(int bucket);

/* calculate difference */
/* The len_diff parameter specifies if the file length is to be included in the difference calculation (len_diff=true) or if it */
/* is to be excluded (len_diff=false). In general, the length should be considered in the difference calculation, but there */
/* could be applications where a part of the adversarial activity might be to add a lot of content. For example to add 1 million */
/* zero bytes at the end of a file. In that case, the caller would want to exclude the length from the calculation. */
int totalDiff(const Tlsh *, bool len_diff=true) const;

/* validate TrendLSH string and reset the hash according to it */
int fromTlshStr(const char* str);

/* check if Tlsh object is valid to operate */
bool isValid() const;

/* display the contents of NOTICE.txt */
static void display_notice();

/* Return the version information used to build this library */
static const char *version();

// operators
Tlsh& operator=(const Tlsh& other);
bool operator==(const Tlsh& other) const;
bool operator!=(const Tlsh& other) const;

~Tlsh();

private:
TlshImpl* impl;
};

#ifdef TLSH_DISTANCE_PARAMETERS
void set_tlsh_distance_parameters(int length_mult_value, int qratio_mult_value, int hist_diff1_add_value, int hist_diff2_add_value, int hist_diff3_add_value);
#endif

#endif

#endif
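
For readers unfamiliar with the `Tlsh` interface declared above, the call order is `update()`, then `final()`, then `getHash()`. The sketch below shows that sequence as a template so it compiles without the TLSH library. `hashString` and `StandInHasher` are hypothetical names introduced here, and the stand-in only counts bytes rather than computing a real TLSH digest.

```cpp
#include <string>

// Drives any hasher exposing the update/final/getHash members declared in
// tlsh.h. With the real library this could be instantiated as hashString<Tlsh>.
template <class Hasher>
std::string hashString(const std::string& input)
{
    Hasher hasher;
    // Feed the data; update() may be called several times for chunked input.
    hasher.update(reinterpret_cast<const unsigned char*>(input.data()),
                  static_cast<unsigned int>(input.size()));
    // Signal that no more data will be added.
    hasher.final();
    // Fetch the hex-encoded digest and guard against a null pointer.
    const char* digest = hasher.getHash();
    return digest ? std::string(digest) : std::string();
}

// Hypothetical stand-in so the sketch is self-contained. It mimics the
// member names only and is NOT a TLSH implementation.
class StandInHasher
{
    unsigned int total = 0;
    std::string text;

public:
    void update(const unsigned char*, unsigned int len) { total += len; }
    void final() { text = "len=" + std::to_string(total); }
    const char* getHash() const { return text.c_str(); }
};
```

The header also declares `isValid()`, which a caller could check after `final()` before trusting the digest, for example when the input may be shorter than `MIN_DATA_LENGTH`.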
