APCCPA '22: Proceedings of the 1st International Workshop on Advances in Point Cloud Compression, Processing and Analysis

SESSION: Session 1: Point Cloud Compression

IPDAE: Improved Patch-Based Deep Autoencoder for Lossy Point Cloud Geometry Compression

  • Kang You
  • Pan Gao
  • Qing Li

Point clouds are a crucial representation of 3D content and are widely used in areas such as virtual reality, mixed reality, and autonomous driving. As the number of points in the data grows, efficiently compressing point clouds becomes a challenging problem. In this paper, we propose a set of significant improvements to patch-based point cloud compression: a learnable context model for entropy coding, octree coding for sampling centroid points, and an integrated compression and training process. In addition, we propose an adversarial network to improve the uniformity of points during reconstruction. Our experiments show that the improved patch-based autoencoder outperforms the state of the art in rate-distortion performance on both sparse and large-scale point clouds. More importantly, our method maintains a short compression time while ensuring reconstruction quality.

GRASP-Net: Geometric Residual Analysis and Synthesis for Point Cloud Compression

  • Jiahao Pang
  • Muhammad Asad Lodhi
  • Dong Tian

Point cloud compression (PCC) is a key enabler for various 3D applications, owing to the universality of the point cloud format. Ideally, 3D point clouds depict object/scene surfaces that are continuous. In practice, as sets of discrete samples, point clouds are locally disconnected and sparsely distributed. This sparse nature hinders the discovery of local correlation among points for compression. Motivated by an analysis based on fractal dimension, we propose a heterogeneous approach with deep learning for lossy point cloud geometry compression. On top of a base layer that compresses a coarse representation of the input, an enhancement layer is designed to cope with the challenging geometric residual/details. Specifically, a point-based network is applied to convert the erratic local details into latent features residing on the coarse point cloud. Then a sparse convolutional neural network operating on the coarse point cloud is launched. It exploits the continuity/smoothness of the coarse geometry to compress the latent features into an enhancement bit-stream that greatly benefits reconstruction quality. When this bit-stream is unavailable, e.g., due to packet loss, we support a skip mode with the same architecture that generates geometric details from the coarse point cloud directly. Experiments on both dense and sparse point clouds demonstrate the state-of-the-art compression performance achieved by our proposal. Our code is available at https://github.com/InterDigitalInc/GRASP-Net.
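The base-plus-enhancement layering described in the abstract can be sketched in toy form: a voxel-grid downsample yields the coarse layer, and each point's offset from its assigned centroid forms a geometric residual. This is only an illustrative stand-in; GRASP-Net learns its enhancement layer with neural networks, and the function and parameter names below are ours, not the paper's.

```python
import numpy as np

def coarse_and_residuals(points, voxel=0.1):
    """Split a cloud into a coarse layer (voxel centroids) plus per-point
    geometric residuals. Illustrative only: the actual codec compresses
    learned latent features, not raw residuals."""
    keys = np.floor(points / voxel).astype(np.int64)   # voxel index per point
    _, inv = np.unique(keys, axis=0, return_inverse=True)
    n = inv.max() + 1
    counts = np.bincount(inv, minlength=n).astype(float)
    coarse = np.zeros((n, points.shape[1]))
    for d in range(points.shape[1]):                   # centroid per voxel
        coarse[:, d] = np.bincount(inv, weights=points[:, d], minlength=n) / counts
    residuals = points - coarse[inv]                   # fine-scale details
    return coarse, residuals, inv
```

By construction, `coarse[inv] + residuals` reproduces the input exactly, which is the sense in which the enhancement layer "completes" the base layer.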

Wiener Filter-Based Point Cloud Adaptive Denoising for Video-based Point Cloud Compression

  • Jinrui Xing
  • Hui Yuan
  • Chen Chen
  • Tian Guo

We propose a Wiener filter-based point cloud adaptive denoising method for the video-based point cloud compression (V-PCC) platform. The proposed Wiener filter operates on the two-dimensional (2D) geometry images generated by V-PCC. Because pixel values in a 2D geometry image exhibit large local variation, we propose a neighborhood difference-based adaptive filtering method. Specifically, pixels in a 2D geometry image are grouped according to their neighborhood differences, and Wiener filtering is performed on each category separately. At the decoder, the Wiener filter is applied to the distorted images using the coefficients and other auxiliary information transmitted from the encoder. Experimental results show that an average -5.4% point-to-point geometry BD-Rate can be achieved by implementing our method on V-PCC, leading to better subjective quality.
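The grouping idea in this abstract, binning pixels by their neighborhood difference and filtering each bin separately, can be illustrated with a minimal NumPy sketch. The window size, thresholds, and shrinkage form below are our own illustrative assumptions, not the coefficients V-PCC would actually transmit.

```python
import numpy as np

def neighborhood_difference(img, k=3):
    """Absolute difference between each pixel and its local (k x k) mean."""
    pad = k // 2
    padded = np.pad(img, pad, mode="edge")
    local_mean = np.zeros(img.shape, dtype=np.float64)
    for dy in range(k):                 # simple sliding-window sum
        for dx in range(k):
            local_mean += padded[dy:dy + img.shape[0], dx:dx + img.shape[1]]
    local_mean /= k * k
    return np.abs(img - local_mean)

def groupwise_wiener(noisy, noise_var, thresholds=(2.0, 8.0), k=3):
    """Group pixels by neighborhood difference, then apply classic Wiener
    shrinkage per group (hypothetical parameters, for illustration only)."""
    diff = neighborhood_difference(noisy, k)
    groups = np.digitize(diff, thresholds)   # 0 = flat, 1 = mid, 2 = sharp
    out = noisy.astype(np.float64).copy()
    for g in np.unique(groups):
        mask = groups == g
        mean, var = noisy[mask].mean(), noisy[mask].var()
        gain = max(var - noise_var, 0.0) / var if var > 0 else 0.0
        out[mask] = mean + gain * (noisy[mask] - mean)  # Wiener shrinkage
    return out
```

Flat regions (small neighborhood difference) get stronger smoothing, while high-variation regions keep more of their original values, mirroring the adaptivity the method is after.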

End-to-End Point Cloud Geometry Compression and Analysis with Sparse Tensor

  • Liang Xie
  • Wei Gao
  • Huiming Zheng

With the rapid development of deep learning, encoded objects such as images, videos, and point clouds are increasingly consumed by downstream tasks optimized with deep learning. Traditional coding tools are optimized for human perception, not machine vision. Therefore, we put forward a lossy point cloud compression method for machine vision, which uses elaborately extracted features to preserve point cloud classification accuracy. We present a multi-scale channel attention module that integrates features across channels and dimensions well, ensuring compression performance while incorporating high-level semantic information. The experimental results demonstrate that our method achieves 30% BD-Rate gains and a 5% improvement in classification accuracy compared with PCGCv2 on ModelNet40.

Transformer and Upsampling-Based Point Cloud Compression

  • Junteng Zhang
  • Gexin Liu
  • Dandan Ding
  • Zhan Ma

Learning-based point cloud compression has exhibited superior coding performance over traditional methods such as MPEG G-PCC. Considering that conventional point cloud representation formats (e.g., octree or voxel) introduce additional errors and affect reconstruction quality, we directly use the point-based representation and develop a framework that leverages transformer and upsampling techniques for point cloud compression. To extract latent features that characterize an input point cloud well, we build an end-to-end learning framework: at the encoder side, we leverage cascaded transformers to extract and enhance useful features for entropy coding; at the decoder side, in addition to the transformers, an upsampling module utilizing both coordinates and features is devised to reconstruct the point cloud progressively. Experimental results demonstrate that the proposed method achieves the best coding performance against state-of-the-art point-based methods, e.g., gains of >1 dB in D1 and D2 PSNR at a bitrate of 0.10 bpp, as well as more visually pleasing reconstructions. Extensive ablation studies also confirm the effectiveness of the transformer and upsampling modules.
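The D1 (point-to-point) PSNR figure quoted above is derived from symmetric nearest-neighbor distances between the reference and reconstructed clouds. A minimal brute-force sketch follows; it assumes the reference bounding-box diagonal as the peak value, though peak conventions differ between MPEG test conditions and individual papers.

```python
import numpy as np

def nn_sq_dists(a, b):
    """Squared distance from each point in a to its nearest neighbor in b
    (brute force; fine for small clouds, use a k-d tree for large ones)."""
    d = ((a[:, None, :] - b[None, :, :]) ** 2).sum(-1)
    return d.min(axis=1)

def d1_psnr(ref, rec, peak=None):
    """Symmetric point-to-point (D1) geometry PSNR. `peak` defaults to the
    reference bounding-box diagonal; this choice is an assumption here."""
    if peak is None:
        peak = np.linalg.norm(ref.max(0) - ref.min(0))
    # Symmetrize: take the worse of the two directed mean squared errors.
    mse = max(nn_sq_dists(ref, rec).mean(), nn_sq_dists(rec, ref).mean())
    return 10.0 * np.log10(peak ** 2 / mse) if mse > 0 else np.inf
```

D2 (point-to-plane) PSNR additionally projects each error vector onto the local surface normal; the structure of the computation is otherwise the same.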

View-Adaptive Streaming of Point Cloud Scenes through combined Decomposition and Video-based Coding

  • Michael Rudolph
  • Amr Rizk

Video-based Point Cloud Coding (V-PCC) has emerged as the MPEG standard for the compression of dynamic Point Clouds (PCs). Instrumental to its success is that it allows transmitting point cloud objects over the Internet using well-established methods for video streaming. A key difference from classical video streaming, however, is that at any point in time only a portion of the PC objects in the scene is visible to the user. Hence, streaming an entire object at one encoding quality wastes bandwidth. In this work, we present Normal-separated Video-based Point Cloud Compression (NoVA-PCC), a method that combines video-based point cloud encoding with an object decomposition based on surface normals. In a nutshell, using NoVA-PCC we obtain non-overlapping, separately encoded view-specific segments that can be streamed independently. This encoding flexibility is then leveraged to guide the streaming client in selecting the bit rates of the view-specific segments, e.g., to assign higher qualities to better visible PC object parts. Given a fixed available streaming bandwidth, we show that this increases the visual quality compared to compressing PC objects at one overall quality. We also show that this freedom in assigning bit rates to view-specific segments allows better utilization of the available streaming bandwidth compared with a fixed set of predefined overall object qualities. Finally, in light of dynamic user motion in multi-object scenes, we show the need for accurate user motion prediction to effectively spend the streaming bandwidth on the visible PC object parts.
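The normal-based decomposition described above can be approximated, in spirit, by assigning each point to one of six view-specific segments according to the dominant axis and sign of its surface normal, much like V-PCC's projection-plane selection. This is a hypothetical sketch; the paper's actual segmentation may differ.

```python
import numpy as np

def split_by_normal(points, normals):
    """Partition a point cloud into six segments by the dominant axis of each
    point's surface normal (one segment per +/- x, y, z viewing direction).
    Simplified stand-in for NoVA-PCC's decomposition, for illustration only."""
    axis = np.abs(normals).argmax(axis=1)                 # dominant axis 0/1/2
    sign = (normals[np.arange(len(normals)), axis] < 0).astype(int)
    segment = axis * 2 + sign                             # segment id in 0..5
    return {s: points[segment == s] for s in np.unique(segment)}
```

Because the segments are disjoint and cover all points, each can be encoded and rate-adapted independently, which is the property the streaming client exploits.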

SESSION: Session 2: Point Cloud Processing and Analysis

OpenPointCloud-V2: A Deep Learning Based Open-Source Algorithm Library of Point Cloud Processing

  • Yongchi Zhang
  • Wei Gao
  • Ge Li

Point cloud processing is a significant research field, as 3D point clouds play a vital part in visual applications. Considering that no prior work uniformly organizes point cloud processing methods, we build a deep learning-based open-source algorithm library for point cloud processing (OpenPointCloud). This paper gives a fundamental overview and analysis of point cloud processing methods, covering five point cloud sampling models, one model for point cloud completion, and one for point cloud post-processing. We comprehensively introduce these methods in our algorithm library and additionally provide implementations in mainstream machine learning frameworks such as TensorFlow, TensorLayer, and PyTorch. We conduct comprehensive evaluation experiments on several point cloud datasets to systematically compare the performance of all the point cloud processing algorithms in the library.

Quality Evaluation of Machine Learning-based Point Cloud Coding Solutions

  • Joao Prazeres
  • Rafael Rodrigues
  • Manuela Pereira
  • Antonio M.G. Pinheiro

In this paper, a quality evaluation of three point cloud coding solutions based on machine learning technology is presented, notably ADLPCC, PCC_GEO_CNN, and PCGC, as well as LUT_SR, which uses multi-resolution look-up tables. Moreover, MPEG G-PCC was used as an anchor. A set of six point clouds, representing both landscapes and objects, was coded with the five encoders at different bit rates, and a subjective test, in which the distorted and reference point clouds were rotated side by side in a video sequence, was carried out to assess their performance. Furthermore, the performance of point cloud objective quality metrics that usually provide a good representation of coded content is analyzed against the subjective evaluation results. The obtained results suggest that some of these metrics fail to represent the perceived quality well and thus are not suitable for evaluating some distortions created by machine learning-based solutions. A comparison between the analyzed metrics and the type of represented scene or codec is also presented.