Abstract – Simulated ultrasound data are an important tool for the development and
validation of quantitative image analysis methods in echocardiography.
Unfortunately, simulation time can become prohibitive when a large number of
scatterers must be included. The COLE algorithm by Gao et al. is a fast
convolution-based simulator that trades simulation accuracy for better speed.
We present highly optimized parallel CPU and GPU implementations of the COLE
algorithm with an emphasis on dynamic simulation, which involves moving point
scatterers. We argue that it is important to minimize the amount of data
transferred from the CPU to get good performance on the GPU. We achieve this by
storing the complete trajectories of the dynamic point scatterers as spline
curves in GPU memory. This leads to good efficiency when simulating sequences
with a large number of frames, such as B-mode and tissue Doppler data for a
whole cardiac cycle. In addition, we propose a phase-based subsample delay
technique that efficiently eliminates the flickering artifacts visible in
B-mode sequences when COLE is used without adequate temporal oversampling. To
assess the performance, we used a laptop
computer and a desktop computer, each with a multicore Intel CPU and an NVIDIA
GPU. Running the simulator on a high-end Titan X GPU, we saw a speedup of two
orders of magnitude compared to the parallel CPU version, three orders of
magnitude compared to the simulation times reported by Gao et al. in their
paper on COLE, and a speedup of 27,000 times compared to the multithreaded
version of Field II, using the numbers given in a paper by Jensen. We hope that
by releasing the simulator as an open-source project, we will encourage its use
and further development.

Keywords – Simulation,
Ultrasonic imaging.

I.     Introduction

       Simulated ultrasound data
have a range of applications. Automatic segmentation algorithms used for
quantitative image analysis have a number of free parameters that must be tuned
to achieve maximum performance for a specific application. Fast simulation of
ultrasound images is important not only for educational purposes but also for
the validation and standardization of existing techniques [1].

Dynamic simulations are important in cardiac and vascular imaging, for both
B-mode and Doppler imaging. When simulating color Doppler or M-mode scans, the
target varies slightly from one simulated beam to the next, based on a motion
model.

       There are several ways to simulate
ultrasound images. One class of simulators, usually designed for training
purposes, is based on the use of tomography recordings for the anatomy and ray
tracing to simulate wave propagation. Good results are reported in [2]–[4],
both regarding image quality and simulation time. However, while these
simulators are able to model effects such as reverberation and shadowing, the
speckle pattern is often not physically accurate enough, e.g., for Doppler
simulation.

       A third large class of simulators
is based on a collection of point scatterers as the target. A standard tool in
the ultrasound community is Field II [6], [7]. This program simulates linear
acoustics using the accurate spatial impulse response method, but simulation
times are often long.

       In this paper, the emphasis will be
on simulating whole ultrasound images. A widely used and fast technique is the
convolution model, where a collection of point scatterers is convolved with an
emitted pulse to give a simulated image. The COLE algorithm [8], the basis for
our work, uses a 1-D convolution for each RF line: for every such line to be
simulated, all scatterers are projected onto it, and the resulting projection
signal is convolved with the pulse waveform to produce the simulated RF signal.
Similarly, Marion and Vray [1] project the scatterers onto a 2-D plane,
followed by a 2-D convolution, to simulate an image plane.
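
The 1-D convolution model described above can be illustrated with a short sketch. This is not the COLE implementation itself: the function names, the Gaussian-windowed cosine pulse, the Gaussian lateral weighting (`beam_sigma`), and the rounding of each delay to the nearest sample are all assumptions made for illustration. In particular, nearest-sample rounding is only the simplest choice and does not include any subsample delay handling.

```python
import math

def gaussian_pulse(fs, fc, n_cycles=2.0):
    """Gaussian-windowed cosine sampled at fs (Hz); an assumed pulse shape."""
    sigma = n_cycles / (2.0 * fc)            # envelope std. dev. in seconds (assumed)
    half = int(math.ceil(3.0 * sigma * fs))  # truncate the envelope at about 3 sigma
    return [math.exp(-0.5 * ((k / fs) / sigma) ** 2)
            * math.cos(2.0 * math.pi * fc * k / fs)
            for k in range(-half, half + 1)]

def convolve_full(x, h):
    """Direct 1-D convolution; output length is len(x) + len(h) - 1."""
    y = [0.0] * (len(x) + len(h) - 1)
    for i, xi in enumerate(x):
        if xi == 0.0:
            continue                         # the projected signal is mostly zeros
        for j, hj in enumerate(h):
            y[i + j] += xi * hj
    return y

def simulate_rf_line(scatterers, origin, direction, fs, c, n_samples,
                     pulse, beam_sigma):
    """Simulate one RF line.

    scatterers: iterable of (x, y, z, amplitude) in metres.
    origin, direction: beam start point and unit direction vector.
    """
    projected = [0.0] * n_samples
    for x, y, z, amp in scatterers:
        rel = (x - origin[0], y - origin[1], z - origin[2])
        depth = sum(r * d for r, d in zip(rel, direction))  # distance along the beam
        off2 = sum(r * r for r in rel) - depth * depth      # squared distance from the beam axis
        idx = int(round(2.0 * depth / c * fs))              # two-way travel time as a sample index
        if 0 <= idx < n_samples:
            weight = math.exp(-0.5 * max(off2, 0.0) / beam_sigma ** 2)
            projected[idx] += amp * weight
    full = convolve_full(projected, pulse)
    start = len(pulse) // 2                  # centre the symmetric pulse on each scatterer
    return full[start:start + n_samples]

# Example: one on-axis scatterer at 20 mm depth, beam along +z.
pulse = gaussian_pulse(fs=50e6, fc=2.5e6)
rf = simulate_rf_line([(0.0, 0.0, 0.02, 1.0)], (0.0, 0.0, 0.0), (0.0, 0.0, 1.0),
                      fs=50e6, c=1540.0, n_samples=2048,
                      pulse=pulse, beam_sigma=1e-3)
```

Because each RF line is built from its own projection, the scatterer positions can differ from line to line, and the beam origin and direction can be chosen freely for every line.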

       The attractive qualities of using 1-D
convolution are that arbitrary scan geometries are easily supported, as well as
dynamic simulations, such as M-mode or color Doppler, where the scatterers
constantly change position between transmit events.

       Using GPUs can give a significant
performance increase, but they come with their own set of challenges, most
notably the issue of data copying: all memory transfers between the CPU and the
GPU must currently go through the Peripheral Component Interconnect (PCI)
Express bus, which is usually limited to less than 10 Gbytes/s. A commercial
edition of Field II, optimized with multithreading and related memory
allocation, has been released in [1], with reported speedups of between 13 and
30 relative to the free, single-threaded version of Field II.
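
To put the PCI Express limit mentioned above in perspective, the cost of copying scatterer data to the GPU can be estimated with simple arithmetic. The sketch below is illustrative only: the function name, the scatterer count, the number of frames, and the storage layout (four single-precision floats per scatterer) are assumed values, not figures from this paper. The 10 Gbytes/s rate is taken as an upper bound, so the result is an optimistic lower bound on transfer time.

```python
def upload_time_s(n_scatterers, floats_per_scatterer=4, bytes_per_float=4,
                  bus_bytes_per_s=10e9):
    """Lower bound on the time to copy one set of scatterer data over the bus."""
    n_bytes = n_scatterers * floats_per_scatterer * bytes_per_float
    return n_bytes / bus_bytes_per_s

# Assumed scenario: one million scatterers re-uploaded for each of 100 frames.
per_frame = upload_time_s(1_000_000)   # 16 MB per upload
total = 100 * per_frame                # cost if positions are re-sent every frame
print(f"per frame: {per_frame * 1e3:.2f} ms, 100 frames: {total:.2f} s")
```

Under these assumptions, the copy cost grows linearly with the number of times the scatterer positions are re-sent, which is why keeping the data resident on the GPU is attractive.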