MediaPipe v0.10.13
Build changes
- Make Holistic C++ graph public until we have a C++ API
- Added a test image to tasks/testdata/vision
- Update dependency in inference_calculator_metal to make it OSS compatible
- Add build rule for gpt2_unicode_mapping_calculator to ODML repo
- Update TF patch to match new version
- Make model_asset_bundle_resources public
Framework and core calculator improvements
- Added Interactive Segmenter C Tasks API and updated Image Segmenter + Pose Landmarker API/tests
- Moved some utility functions used by the segmenter APIs to a shared test namespace
- Added Face Stylizer C API
- Add config options to disable default service in mediapipe vision tasks.
- Fix race condition in GetCurrentThreadId
- Update base Docker image to Ubuntu 22.04
- Adding support for boolean tensor inputs to InferenceInterpreterDelegateRunner
- Updates text_embedder_cosine_similarity signature to use Embedding pointers
- Fix mediapipe/framework/packet.h build failure on C++20.
- Finish allowing "direct Tensor" inputs and outputs in all InferenceCalculator variants.
- Add SizeInTokens API to C layer
- Remove dependency on "torch" for MediaPipe Python package
- Add option for allowing cropping beyond image borders in ContentZoomingCalculator
- Update XNNPACK
- Add support for loading models from memory mapped files
- InferenceCalculator: Add option to use mmap for model loading
- Workaround the flaky status of XNN_FLAG_KEEP_DIMS
- Add a HasError method and a test for ErrorReporter
- Update cached_kernel_path option doc
- Adds MemoryManager to several Tensor-generating calculators
- Add support for mmapping models to more inference calculators
- Removes InferenceRunner interface from InferenceCalculatorNodeImpl
- ContentZoomingCalculator: Fix initial state for "last measured" rect
- Reduce memory usage of LLM Web API
- Propagate packet timestamps to Android surfaces
- Make previous_log_index_ atomic to fix some race condition issues in the MediaPipe Profiler
- Adds MemoryManager to TensorConverter Calculator
- Fix template_parser's crash when destructing stowed_messages_ for proto3.
- ContentZoomingCalculator: Don't clamp when allow_cropping_outside_frame is set (see the configuration sketch after this list)
- Refactor Metal path out of TensorsToSegmentationCalculator main file.
- Update Protobuf dependency to 4.x
- Expand AssetManager docs to provide JNI initialization method and proper usage pattern through GetResourceContents.
- Enables reordering of input and output tensor streams in InferenceCalculators
- Add CopyCpuInputIntoTfLiteTensor
- Add CopyTfLiteTensorIntoCpuOutput
- Add int64_t to MP tensor (see the example after this list).
- Make it clear that sinks should outlive the graph initialized with the corresponding config (see the sketch after this list)
- Update initializeNativeAssetManager docs - singleton + MediaPipe usage
- Deprecated ImageSource in favor of standard TexImageSource.
- Add support for additional tensor_data_type for tensor conversion calculators.
- Fix TensorsToSegmentationConverterMetal RunInGlContext().
- Change the naming of converters of ImageToTensorCalculator and TensorsToSegmentationCalculator.
- Update WebGL2 on OffscreenCanvas support check to include Safari 17+
- Use TextFormat for serialization
- Add IsConnected() to graph builder SideOut
- Use "ahwb" prefix for "release_callback" to disambiguate ahwb vs. non-ahwb callbacks.
- Allow multiple AHWB release callbacks.
- Add itemized loop calculators
- Add support for a Vector string packet to the constant_side_packet_calculator.
- Fix an issue in BeginItemLoopCalculator
- Allow arbitrary timestamp changes in BeginItemLoopCalculator
- Add unsigned int type to MediaPipe Web binding.
- Add the ability to load a drishti graph template from a byte array.
- Add error handling to CreateSession in C API
- Report received dims size in the error.
- Adds conditional TFLITE_CONDITIONAL_NAMESPACE namespace to .cc implementations
- Adds support for tensor scalar output to VectorIntToTensorCalculator.
- Parse num classes per detection from TFLite_Detection_PostProcess op.
- Output error status int in case AHWB allocation fails.
- Support more types for inference_calculator_util tensor copying functions.
- Upgrade TensorFlow
- Fix ASAN error by removing tensor data filling for kNone in test.
- Added warning when MultiPoolOptions.keep_count is reached
- Updated the safetensors converter to support the Gemma 7B model
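The "sinks should outlive the graph" item above is about lifetime ordering: the container a sink writes into must stay alive until the graph has finished. Below is a minimal sketch of that pattern with tool::AddVectorSink, assuming a trivial graph around the stock PassThroughCalculator (which must be linked into the binary); it illustrates the ordering only, not the sole way to observe outputs:

```cpp
#include <vector>

#include "mediapipe/framework/calculator_framework.h"
#include "mediapipe/framework/port/parse_text_proto.h"
#include "mediapipe/framework/port/status.h"
#include "mediapipe/framework/tool/sink.h"

absl::Status RunWithVectorSink() {
  mediapipe::CalculatorGraphConfig config =
      mediapipe::ParseTextProtoOrDie<mediapipe::CalculatorGraphConfig>(R"pb(
        input_stream: "in"
        node {
          calculator: "PassThroughCalculator"
          input_stream: "in"
          output_stream: "out"
        }
      )pb");

  // Declare the sink vector *before* the graph so it outlives it: the graph
  // keeps writing into this vector until WaitUntilDone() returns.
  std::vector<mediapipe::Packet> out_packets;
  mediapipe::tool::AddVectorSink("out", &config, &out_packets);

  mediapipe::CalculatorGraph graph;
  MP_RETURN_IF_ERROR(graph.Initialize(config));
  MP_RETURN_IF_ERROR(graph.StartRun({}));
  MP_RETURN_IF_ERROR(graph.AddPacketToInputStream(
      "in", mediapipe::MakePacket<int>(42).At(mediapipe::Timestamp(0))));
  MP_RETURN_IF_ERROR(graph.CloseInputStream("in"));
  MP_RETURN_IF_ERROR(graph.WaitUntilDone());
  // Only now is it safe to let out_packets go out of scope.
  return absl::OkStatus();
}
```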
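For the int64_t tensor item above, here is a small sketch of what creating and filling such a tensor could look like through the mediapipe::Tensor CPU views; the kInt64 enumerator name is an assumption and should be checked against mediapipe/framework/formats/tensor.h:

```cpp
#include <cstdint>

#include "mediapipe/framework/formats/tensor.h"

void FillInt64Tensor() {
  // ElementType::kInt64 is assumed to be the enumerator added for 64-bit
  // integer tensors; verify the exact name in tensor.h.
  mediapipe::Tensor tensor(mediapipe::Tensor::ElementType::kInt64,
                           mediapipe::Tensor::Shape({1, 4}));
  auto view = tensor.GetCpuWriteView();
  int64_t* data = view.buffer<int64_t>();
  for (int i = 0; i < 4; ++i) {
    data[i] = i;  // Write directly into the tensor's CPU buffer.
  }
}
```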
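For the cropping-outside-frame option referenced above, a hedged configuration sketch. Only allow_cropping_outside_frame is named by the release note itself; the options extension path (mediapipe.autoflip.ContentZoomingCalculatorOptions.ext) and the stream tags are assumptions, so check content_zooming_calculator.proto before copying this:

```cpp
#include "mediapipe/framework/calculator.pb.h"
#include "mediapipe/framework/port/parse_text_proto.h"

// Sketch of a ContentZoomingCalculator node with cropping beyond the image
// borders enabled. Extension name and stream tags are assumed, not verified.
mediapipe::CalculatorGraphConfig MakeContentZoomingConfig() {
  return mediapipe::ParseTextProtoOrDie<mediapipe::CalculatorGraphConfig>(R"pb(
    input_stream: "VIDEO_SIZE:size"
    input_stream: "DETECTIONS:detections"
    output_stream: "CROP_RECT:rect"
    node {
      calculator: "ContentZoomingCalculator"
      input_stream: "VIDEO_SIZE:size"
      input_stream: "DETECTIONS:detections"
      output_stream: "CROP_RECT:rect"
      options {
        [mediapipe.autoflip.ContentZoomingCalculatorOptions.ext] {
          allow_cropping_outside_frame: true
        }
      }
    }
  )pb");
}
```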
MediaPipe Tasks update
This section highlights changes made for a specific platform that do not propagate to other platforms.
Android
- Add sizeInTokens API to the Java LLM Inference Engine
- Set a default empty LoRA path for LlmInference.
iOS
- Updated iOS vision task runner to support tasks without norm rect stream
- Added iOS holistic landmarker result helpers and implementation
- Add async stream API to LlmInference for better Swift compatibility.
- Remove duplicate symbols from MediaPipeTasksGenAIC
- Apply iOS build fixes
- Revert avoid_deps from MediaPipeTasksGenAI_framework and MediaPipeTasksGenAIC_framework.
- Fixed missing method in iOS vision task runner
- Fixed condition check in MPPVisionTaskRunner
- Fixed incorrect types in MPPHolisticLandmarkerResult
- Added init with proto utility to MPPHolisticLandmarkerResult
- Added MPPHolisticLandmarker helper for initialization from protobuf text file
- Added Holistic Landmarker Objective C Tests
- Added size in tokens API to iOS LlmInference
- Fixed type of holistic landmarker pose segmentation mask
- Added optional initialization of face blendshapes from protobuf file
- Added video mode and option tests to MPPHolisticLandmarker tests
- Updated documentation of MPPHolisticLandmarkerResult+Helpers.h
- Added iOS face stylizer implementation, options helpers, Result Helpers
- Updated iOS MPPImage Utils to support creation of output images from C++ RGB images
- Updated constants in MPPImage+Utils
- Added iOS Face Stylizer tests
- Updated documentation of iOS MPPFaceStylizer
- Added missing connections to iOS Pose Landmarker
- Added iOS MPPFloatBuffer, MPPFloatRingBuffer, ring buffer tests, MPPAudioData
- Update swift name of MPPAudioDataFormat
- Added live stream mode tests to iOS holistic landmarker
- Updated method signature in MPPHolisticLandmarkerResult+Helpers
- Added test for nil modelPath to face stylizer
- Fixed memory deallocation issues when creating images using MPPImage+Utils
- Exposed iOS Face Stylizer headers in xcframework build
- Added iOS MPPAudioRecord
- Updated method signature in MPPFloatRingBuffer
- Added audio error codes to MPPCommon.h
- Move MPPAudioDataFormat to a new file
- Updated method signature in MPPAudioRecord
- Added test utils for AVAudioPCMBuffer
- Added basic failure tests for MPPAudioRecord
- Add support for static LoRA on iOS.
- Updated AVAudioPCMBuffer convenience initializer to a class method
- Fix LlmTaskRunner.swift
- Added buffer loading tests to MPPAudioRecord
- Added method to load from audio record in MPPAudioData
- Fix LoRA integration in LlmInference.swift
- Add error handling to GenAI's Swift API
Javascript
- Add export to GenAI Fileset API
- Return the full string from the model
- Fix code snippet in NPM Readme
- Add Holistic Landmarker to NPM Readme
- Update npm README
- Add tokenizer normalization node to LLM web graph.
- Add Matrix to vision.d.ts
Python
- Fix API documentation link for ImageProcessingOptions
- Disable text_embedder and text_classifier tests for Python
- Update safetensors converter for LoRA weights conversion for GEMMA 2B.
- Update model converter to support Phi-2 LoRA
- Mark Optional for landmark_drawing_spec argument
- Add LoRA options to converter.
Model Maker changes
- Keep tensorflow and tf-models-official below 2.16, since tensorflow-addons breaks with tensorflow 2.16.
- Read from default checkpoint path when training MobileBERT.
- Add checkpoint_frequency in model maker.
- Add repeat field in hyperparameters in model maker classifier.
- Add mobilenet_v2 Keras model spec.
- Only use auc, precision, and recall for binary classification problems.
- Drop remainder from datasets in text_classifier. This helps avoid NaN-loss issues during TPU training.
- Disable object detector oss test due to flakiness
MediaPipe Dependencies
- Update WASM files for 0.10.13 release
- Update WASM files to fix issues in the LLM Inference API