Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix memory leak in queue #1488

Merged
merged 1 commit into from
Oct 17, 2024
Merged

Fix memory leak in queue #1488

merged 1 commit into from
Oct 17, 2024

Conversation

ankur22
Copy link
Collaborator

@ankur22 ankur22 commented Oct 16, 2024

When we read off the read queue, we alias the slice. What has been forgotten though is to remove the reference to the entry in the slice after it has been read. This fixes that by overwriting the entry with a new Event. This now allows the GC to pick up the buffered data that the event was holding onto.

What?

This is removing the reference to the data that is no longer needed once the Event has been sent and read off the queue,. but before the read queue is aliased.

Why?

I remember having this complicated write/read slice in the emitter queue. The reason we needed this to keep the incoming CDP messages in order, otherwise they would go out of order (PR). The issue is that once the event is read off the read queue, the read slice is aliased, but the event is still held in memory in the underlying array!

By overwriting the entry before the alias the data that was being held onto is now free to be picked up by the GC.

How to Test

Testing this change requires the following to be performed:

  1. First we need to see the memory leak in main, so checkout main;
  2. You will need the following test script:
mem-test.js
import { browser } from "k6/browser";

export const options = {
  scenarios: {
    ui: {
      iterations: 1,
      executor: "shared-iterations",
      maxDuration: '120m',
      options: {
        browser: {
          type: "chromium",
        },
      },
    },
  }
};

export default async function () {
  const page = await browser.newPage();

  for (var i = 0; i < 10; i++) {
    await page.goto("http://2022wfh.ddns.net/test", { waitUntil: "networkidle" });
  }

  await page.close();
}
  1. Run the following script from the root of the project directory with ./mem_bench.sh on_main.csv mem-test.js:
Bash script to run and keep track of memory usage
#!/bin/bash

# Usage: ./mem_bench.sh <total.csv> <test.js>

# Check if at least two arguments are provided
if [ -z "$2" ]; then
  echo "Usage: $0 <total.csv> <test.js>"
  exit 1
fi

total_output_file="$1"

# Write the CSV header
echo "Total" > "$total_output_file"

# Run the test in the background
K6_BROWSER_ENABLE_PPROF=true xk6 run $2 &

# Give xk6 time to run, build a new binary, and then run k6.
sleep 4

# Capture the PID of the background process
K6_PID=$!
echo "K6 process started with PID: $K6_PID"

while true; do
  sleep 1

  # Run curl and capture the exit status
  curl http://localhost:6060/debug/pprof/heap --output pprof-heap-1
  if [ $? -ne 0 ]; then
    echo "Curl command failed. Exiting loop."
    break
  fi

  # Retrieve the total memory used and convert to bytes
  memory=$(go tool pprof --top pprof-heap-1 | awk 'NR == 4 {print $8}')

  # Extract the number and the unit using sed instead of grep
  mem_number=$(echo "$memory" | sed -E 's/([0-9.]+)[a-zA-Z]+/\1/')  # Extract the numeric part
  mem_unit=$(echo "$memory" | sed -E 's/[0-9.]+([a-zA-Z]+)/\1/')    # Extract the unit part

  # Convert memory to bytes
  case $mem_unit in
    kB|KB)
      mem_bytes=$(echo "$mem_number * 1024" | bc)  # 1 KB = 1024 bytes
      ;;
    MB|mb|Mb)
      mem_bytes=$(echo "$mem_number * 1024 * 1024" | bc)  # 1 MB = 1024 KB = 1024 * 1024 bytes
      ;;
    GB|gb|Gb)
      mem_bytes=$(echo "$mem_number * 1024 * 1024 * 1024" | bc)  # 1 GB = 1024 MB = 1024^3 bytes
      ;;
    TB|tb|Tb)
      mem_bytes=$(echo "$mem_number * 1024 * 1024 * 1024 * 1024" | bc)  # 1 TB = 1024 GB = 1024^4 bytes
      ;;
    *)
      mem_bytes=$mem_number  # If no unit is present, assume it's already in bytes
      ;;
  esac

  # Append the memory in bytes to the output file
  echo "$mem_bytes" >> "$total_output_file"

done

# Optionally wait for the K6 process to finish
wait $K6_PID
echo "K6 process with PID: $K6_PID has finished."

# newline
echo

# Calculate the average memory in bytes
if [ -f "$total_output_file" ]; then
  # Use awk to sum, find the min, and find the max values
  stats=$(awk 'NR > 1 {
    sum += $1;
    if (min == "" || $1 < min) min = $1;
    if ($1 > max) max = $1;
    count++;
  } END {
    printf "%.0f %.0f %.0f %d\n", sum, min, max, count
  }' "$total_output_file")

  # Split the stats into individual variables
  total_bytes=$(echo $stats | awk '{print $1}')
  min_bytes=$(echo $stats | awk '{print $2}')
  max_bytes=$(echo $stats | awk '{print $3}')
  num_entries=$(echo $stats | awk '{print $4}')

  # Debugging output
  echo "Total bytes: $total_bytes"
  echo "Min bytes: $min_bytes"
  echo "Max bytes: $max_bytes"
  echo "Number of entries: $num_entries"

  # Use bc with scale to handle floating-point division
  average_bytes=$(echo "scale=2; $total_bytes / $num_entries" | bc)
  echo "Average memory in bytes: $average_bytes"
fi
  1. When it completes run go tool pprof -http=":8002" pprof-heap-1
  2. This will open a new tab in chrome with the inuse_space already selected in the graph view, go to http://localhost:8002/ui/flamegraph;
  3. You should see something like:
Screenshot 2024-10-16 at 16 48 05
  1. Now we do the same against the fix branch, checkout fix/memory-leak-in-event-queue;
  2. Follow the same instructions as above from step 3 and you should end up with something like the screenshot below, where io.ReadAll hasn't grown:
Screenshot 2024-10-16 at 16 49 09

Checklist

  • I have performed a self-review of my code
  • I have added tests for my changes
  • I have commented on my code, particularly in hard-to-understand areas

Related PR(s)/Issue(s)

Updates: #1480

When we read off the read queue, we alias the slice. What has been
forgotten though is to remove the reference to the entry in the slice
after it has been read. This fixes that by overwriting the entry with a
new Event. This now allows the GC to pick up the buffered data that the
event was holding onto.
Copy link
Member

@inancgumus inancgumus left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice bit of debugging 👏

So, my understanding is Event.data is any and the queue.read slice was holding onto the Event that was read. Since any can be of any pointer that points to any value, it can clearly leak.

@ankur22
Copy link
Collaborator Author

ankur22 commented Oct 17, 2024

So, my understanding is Event.data is any and the queue.read slice was holding onto the Event that was read. Since any can be of any pointer that points to any value, it can clearly leak.

Yeah, that's correct. I'm wondering if the other parts of the queue/handler in the emitter need to be looked at too for similar issues.

@ankur22 ankur22 merged commit 58f9bc3 into main Oct 17, 2024
23 checks passed
@ankur22 ankur22 deleted the fix/memory-leak-in-event-queue branch October 17, 2024 08:18
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants