Browser Forensics Tool Development Report

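Abstract

This report documents the development of a Python-based browser forensics tool that extracts browsing history from Chrome and Firefox and generates a report in PDF format. It covers the key steps: environment setup, data extraction, data analysis, and report generation. With this tool, the information contained in browser history records can be retrieved and analyzed for use in forensic analysis.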

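1. Introduction

As the internet has become ubiquitous, the browser has become the main tool users rely on to access it, and also an important record of their online activity. In digital forensics, browser history can provide valuable leads. The goal of this project is to develop a tool that extracts a user's browsing history from Chrome and Firefox and generates an easy-to-read PDF report.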

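2. Environment Setup

2.1 Development Environment

Operating system: Windows 10
Programming language: Python 3.8+
Dependencies:
- sqlite3: accesses SQLite databases (part of the Python standard library).
- reportlab: generates the PDF report.
- pandas: data analysis and processing.
- argparse: parses command-line arguments (part of the Python standard library).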

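2.2 Installing Dependencies

Make sure all required Python libraries are installed. sqlite3 and argparse ship with Python, so only the third-party packages need to be installed:

pip install reportlab pandas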

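3. Usage and Results

Usage

If you are unsure how to run the tool, enter the following in a terminal

python evi_hw1.py -h

to display the help text. The tool takes two arguments: the browser to examine and the name of the PDF file to generate. By default, the PDF is written to the current working directory.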
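
The two-argument interface described above could be declared with argparse roughly as follows. This is only a sketch: the argument names browser and output, and the choice of positional arguments, are assumptions for illustration, since the report does not show the actual argument definitions in evi_hw1.py.

```python
import argparse


def build_parser():
    # Hypothetical argument names; the real script's definitions are not shown.
    parser = argparse.ArgumentParser(
        description="Extract browser history and write it to a PDF report."
    )
    parser.add_argument(
        "browser",
        choices=["chrome", "firefox"],
        help="browser whose history should be extracted",
    )
    parser.add_argument("output", help="file name of the PDF report to generate")
    return parser


# Parse an explicit argument list instead of sys.argv so the sketch runs anywhere.
args = build_parser().parse_args(["chrome", "report.pdf"])
print(args.browser, args.output)
```

argparse adds the -h option automatically, which is what makes the help command above work without extra code.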

4. Data Extraction

Browser history is typically stored in SQLite databases. Both Chrome and Firefox store their history in this format, so the sqlite3 library can be used to read it.
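
As a minimal illustration of reading such a database with sqlite3, the sketch below builds a tiny in-memory stand-in for a history database, lists its tables, and queries it. The table layout copies only the four columns this report later reads from Chrome's urls table; the row itself is made-up sample data.

```python
import sqlite3

# In-memory stand-in for a history database (sample data, not a real profile).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE urls (url TEXT, title TEXT, visit_count INTEGER, last_visit_time INTEGER)"
)
conn.execute(
    "INSERT INTO urls VALUES (?, ?, ?, ?)",
    ("https://example.com/", "Example", 3, 13000000000000000),
)

# sqlite_master lists the schema objects, which helps when exploring an unknown database.
tables = [row[0] for row in conn.execute("SELECT name FROM sqlite_master WHERE type = 'table'")]
rows = conn.execute("SELECT url, title, visit_count, last_visit_time FROM urls").fetchall()
conn.close()

print(tables)
print(rows)
```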

4.1 Closing the Browser

To avoid file-locking problems, the browser must be closed before data is extracted. The subprocess module is used to terminate it:

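import subprocess
import time

def close_browser(browser_name):
    try:
        while True:
            # List the running processes and look for the browser's image name.
            result = subprocess.run(['tasklist'], stdout=subprocess.PIPE, text=True)
            if browser_name in result.stdout:
                # Force-terminate every process with this image name.
                subprocess.run(["taskkill", "/F", "/IM", browser_name], check=True)
                print(f"{browser_name} process terminated.")
                time.sleep(1)
            else:
                print(f"All {browser_name} processes are closed; continuing.")
                break
    except subprocess.CalledProcessError as e:
        print(f"Error while closing the browser: {e}")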
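
The check in close_browser is a substring test, so a process whose name merely contains the browser's image name would also match. One possible refinement, sketched here under the assumption that the process list is requested in CSV form (tasklist /FO CSV /NH), is to compare the first CSV field exactly. The helper name is_running and the sample text are hypothetical and only illustrate the parsing step.

```python
import csv
import io


def is_running(tasklist_csv, image_name):
    """Return True if image_name matches the first CSV field of any row exactly."""
    reader = csv.reader(io.StringIO(tasklist_csv))
    return any(row and row[0].lower() == image_name.lower() for row in reader)


# Made-up sample in the shape of CSV process-list output without a header row.
sample = (
    '"chrome.exe","4321","Console","1","150,000 K"\n'
    '"notchrome.exe","5555","Console","1","10,000 K"\n'
)

print(is_running(sample, "chrome.exe"))
print(is_running(sample, "firefox.exe"))
```

Only the parsing is shown, so the sketch runs on any platform; wiring it to the real tasklist call is left as it is in close_browser.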

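4.2 Extracting Chrome History

Where Chrome Stores Its History

Chrome stores browsing history in an SQLite database file named History, usually located under the AppData folder of the user's home directory:

C:\Users\<username>\AppData\Local\Google\Chrome\User Data\Default\History

Code Walkthrough

To extract the Chrome history, Chrome must be closed first, because it locks the History file and prevents other programs from accessing it. The main steps are: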

import os
import shutil
import sqlite3
import time

def get_chrome_history():
    close_browser('chrome.exe')
    time.sleep(2)  # give the processes time to exit and release the file
    history_path = os.path.expanduser('~') + "/AppData/Local/Google/Chrome/User Data/Default/History"
    temp_history_path = os.path.expanduser('~') + "/AppData/Local/Temp/History"
    # Query a copy so the original database is left untouched.
    shutil.copyfile(history_path, temp_history_path)
    conn = sqlite3.connect(temp_history_path)
    cursor = conn.cursor()
    cursor.execute("SELECT url, title, visit_count, last_visit_time FROM urls")
    rows = cursor.fetchall()
    conn.close()
    # Convert the raw timestamp in the last column to a datetime.
    return [(row[0], row[1], row[2], convert_chrome_time(row[3])) for row in rows]

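Close the browser: call close_browser('chrome.exe') to terminate all Chrome processes so that the file is not locked.
Wait: time.sleep(2) gives the processes time to exit completely.
Copy the file: copy the History file to the temporary directory so that the original is not affected.
Read the data: connect to the copied SQLite database and run a query that extracts the URL, title, visit count, and last visit time.
Convert the time: convert_chrome_time converts Chrome's timestamp to a standard date-time value.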

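4.3 Extracting Firefox History

Where Firefox Stores Its History

Firefox stores browsing history in the places.sqlite file, usually located under the AppData folder of the user's home directory:

C:\Users\<username>\AppData\Roaming\Mozilla\Firefox\Profiles\<profile>\places.sqlite

Here <profile> is the name of the user's Firefox profile folder, which usually ends with .default-release or .default-release-1.

Code Walkthrough

To extract the Firefox history, the tool first closes Firefox and then locates the current user's Firefox profile folder. The main steps are: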

import os
import sqlite3
import time

def get_firefox_history():
    close_browser('firefox.exe')
    time.sleep(2)  # give the processes time to exit and release the file
    profile_path = os.path.expanduser('~') + "/AppData/Roaming/Mozilla/Firefox/Profiles/"
    profile_folders = os.listdir(profile_path)
    # Take the first default-release profile; raises IndexError if none exists.
    profile_folder = [f for f in profile_folders if f.endswith('.default-release') or f.endswith('.default-release-1')][0]
    history_path = f"{profile_path}{profile_folder}/places.sqlite"
    # Unlike the Chrome routine, this opens places.sqlite in place rather than a copy.
    conn = sqlite3.connect(history_path)
    cursor = conn.cursor()
    cursor.execute("SELECT url, title, visit_count, last_visit_date FROM moz_places")
    rows = cursor.fetchall()
    conn.close()
    # Convert the raw timestamp in the last column to a datetime.
    return [(row[0], row[1], row[2], convert_firefox_time(row[3])) for row in rows]

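Close the browser: call close_browser('firefox.exe') to terminate all Firefox processes so that the file is not locked.
Select the profile folder: pick a default profile folder from the user's Firefox profiles directory; its name usually ends with .default-release.
Read the data: connect directly to the places.sqlite database (unlike the Chrome routine, no temporary copy is made) and extract the browsing records.
Convert the time: convert_firefox_time converts Firefox's timestamp to a standard date-time value.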
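
The profile-selection step indexes the first match directly, so it fails with an IndexError when no matching folder exists. A slightly more defensive variant might look like the sketch below; pick_profile is a hypothetical helper name, and the folder names in the example are made up.

```python
def pick_profile(folder_names):
    """Return the first default-release profile folder, or None if there is none."""
    suffixes = (".default-release", ".default-release-1")
    for name in sorted(folder_names):
        if name.endswith(suffixes):
            return name
    return None


# Made-up folder names in the usual "<random>.<profile name>" shape.
print(pick_profile(["abcd1234.default", "wxyz5678.default-release"]))
print(pick_profile(["abcd1234.default"]))
```

Returning None lets the caller report a missing profile clearly instead of stopping with a traceback.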

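Timestamp Conversion

Chrome and Firefox use different timestamp formats. Chrome's timestamp is the number of microseconds since January 1, 1601, whereas Firefox's is the number of microseconds since January 1, 1970. The following functions convert these timestamps to human-readable dates: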

def convert_chrome_time(chrome_time):
    return datetime.datetime(1601, 1, 1) + datetime.timedelta(microseconds=chrome_time)

def convert_firefox_time(firefox_time):
    return datetime.datetime(1970, 1, 1) + datetime.timedelta(microseconds=firefox_time)

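These functions convert the browsers' timestamps to standard date-time values, which simplifies later analysis and report generation. The versions below additionally return None for missing values and for timestamps that are out of range: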

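import datetime

def convert_chrome_time(chrome_time):
    try:
        if chrome_time is None:
            return None
        return datetime.datetime(1601, 1, 1) + datetime.timedelta(microseconds=chrome_time)
    except OverflowError:
        print("Invalid date value:", chrome_time)
        return None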

def convert_firefox_time(firefox_time):
    try:
        if firefox_time is None:
            return None
        return datetime.datetime(1970, 1, 1) + datetime.timedelta(microseconds=firefox_time)
    except OverflowError:
        print("Invalid date value:", firefox_time)
        return None
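
The relationship between the two epochs can be checked with a short worked example. The sketch below restates the two conversions in their simplest form and computes the offset between January 1, 1601 and January 1, 1970 in microseconds; a Chrome timestamp equal to that offset should land on the same instant as a Firefox timestamp of 0.

```python
import datetime


def convert_chrome_time(chrome_time):
    # Microseconds since 1601-01-01.
    return datetime.datetime(1601, 1, 1) + datetime.timedelta(microseconds=chrome_time)


def convert_firefox_time(firefox_time):
    # Microseconds since 1970-01-01.
    return datetime.datetime(1970, 1, 1) + datetime.timedelta(microseconds=firefox_time)


# Offset between the two epochs, expressed in microseconds.
epoch_delta_us = (
    datetime.datetime(1970, 1, 1) - datetime.datetime(1601, 1, 1)
) // datetime.timedelta(microseconds=1)

print(epoch_delta_us)
print(convert_chrome_time(epoch_delta_us) == convert_firefox_time(0))
```

The offset works out to 11,644,473,600 seconds (134,774 days), which is a handy constant when comparing raw values from the two browsers.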

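5. Report Generation

The PDF report is generated with the reportlab library. It lists the URL, title, visit count, and last visit time of each record.

5.1 Generating the PDF Report

A PDF file containing a table is created to present the extracted history clearly: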

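import html

from reportlab.lib import colors
from reportlab.lib.pagesizes import letter
from reportlab.lib.styles import getSampleStyleSheet
from reportlab.lib.units import inch
from reportlab.platypus import Paragraph, SimpleDocTemplate, Table, TableStyle

def generate_pdf_report(data, filename):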
    doc = SimpleDocTemplate(filename, pagesize=letter, rightMargin=30, leftMargin=30, topMargin=30, bottomMargin=18)
    elements = []
    
    styles = getSampleStyleSheet()
    table_style = TableStyle([
        ('GRID', (0, 0), (-1, -1), 1, colors.black),
        ('FONTNAME', (0, 0), (-1, -1), 'Helvetica'),
        ('FONTSIZE', (0, 0), (-1, -1), 10),
        ('VALIGN', (0, 0), (-1, -1), 'TOP'),
        ('BACKGROUND', (0, 0), (-1, 0), colors.lightgrey),
        ('TEXTCOLOR', (0, 0), (-1, 0), colors.whitesmoke),
        ('ALIGN', (0, 0), (-1, -1), 'LEFT'),
        ('LEFTPADDING', (0, 0), (-1, -1), 6),
        ('RIGHTPADDING', (0, 0), (-1, -1), 6),
        ('TOPPADDING', (0, 0), (-1, -1), 3),
        ('BOTTOMPADDING', (0, 0), (-1, -1), 3),
    ])
    
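    # Header row, followed by one row per history record.
    data_for_table = [['URL', 'Title', 'Visits', 'Last Visited']]
    for url, title, visits, last_visited in data:
        if last_visited is not None:
            last_visited = last_visited.strftime('%Y-%m-%d %H:%M:%S')
        else:
            last_visited = 'Never'
        if title is None:
            title = "No title"
        else:
            # Characters outside Latin-1 are replaced with '?' because the built-in
            # Helvetica font cannot render them; escaping keeps '<' and '&' in a
            # title from being parsed as Paragraph markup.
            safe_title = title.encode('latin1', 'replace').decode('latin1')
            title = Paragraph(html.escape(safe_title), styles['Normal'])
        # The link text is shortened, while the href keeps the full URL.
        shortened_url = shorten_url(url)
        link = f'<a href="{html.escape(url)}">{html.escape(shortened_url)}</a>'
        data_for_table.append([Paragraph(link, styles['Normal']), title, str(visits), last_visited])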

    t = Table(data_for_table, colWidths=[2.7*inch, 1.8*inch, 0.7*inch, 1.8*inch])
    t.setStyle(table_style)
    elements.append(t)
    
    doc.build(elements)

def shorten_url(url, max_length=50):
    if len(url) <= max_length:
        return url
    else:
        return url[:max_length-3] + "..."

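To make the report easy to read, the generated PDF presents the records as a table. Because URLs and titles can be very long, the contents of the first and second columns would sometimes overflow their cells. I resolved this by:

allocating sensible column widths;
shortening the displayed URL so that the excess is shown as an ellipsis (the full URL is preserved in the link target, so no information is lost);
wrapping the title in a Paragraph so that it wraps automatically.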
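
The effect of the URL shortening can be seen in isolation. The sketch below repeats shorten_url from the listing above and applies it to a made-up long URL: the visible label is cut to 50 characters, while the href still carries the complete, escaped URL.

```python
import html


def shorten_url(url, max_length=50):
    # Keep short URLs as they are; otherwise cut and mark the cut with an ellipsis.
    if len(url) <= max_length:
        return url
    return url[:max_length - 3] + "..."


# Made-up URL that is longer than the 50-character limit.
url = "https://example.com/search?q=" + "a" * 60 + "&page=2"
label = shorten_url(url)
link = f'<a href="{html.escape(url)}">{html.escape(label)}</a>'

print(len(label))
print(link)
```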

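6. Conclusion

This project produced a forensic tool that extracts browsing history from Chrome and Firefox. The tool retrieves a user's browsing history and generates a PDF report, which makes subsequent forensic analysis more convenient. The project demonstrates how Python can be used for data extraction and report generation, and it provides a solid basis for further development and refinement of forensic tools.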
