Description

When designing hardware, logic designers often need to implement an algorithm for which a reference implementation exists as a software function. High-level synthesis (HLS) is an approach that converts an algorithm’s C/C++ implementation into a synthesizable hardware description. However, to get a good hardware implementation, the developer needs to introduce parallelism and concurrency into the algorithm while considering the area impact of each modification. This requires a deep understanding of both the algorithm and the transformations that the HLS compiler performs when generating the hardware implementation from the input code, which can be daunting even for expert designers. Gaining an understanding of the data dependencies and opportunities for parallelism in a software implementation by “staring at the code” is inefficient and impractical. Furthermore, iterating through synthesis and performance evaluation is extremely time-consuming. We introduce an approach that analyzes the input algorithm to determine the characteristics needed for a good HLS implementation. This data is used to create configuration files and code annotations that steer the HLS compiler, without the developer needing an intimate understanding of the reference implementation. An evaluation with a set of representative kernels showed gains between 5x and 600x compared with a sequential hardware implementation.
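To make the idea of code annotations concrete, here is a minimal sketch of two small kernels with different dependency structure. It is not taken from the work described above: the function names `scale` and `prefix_sum`, the array size `N`, and the chosen directive values are hypothetical, and the pragmas follow the AMD/Xilinx Vitis HLS syntax purely as an assumed target, since no specific HLS tool is named here. A standard C++ compiler ignores the unknown pragmas (possibly with a warning), so the code also runs as plain software.

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical kernel size, chosen only for illustration.
constexpr std::size_t N = 16;

// Element-wise scaling: no iteration depends on another, so the loop
// body can be replicated in hardware. The unroll factor is an assumed
// value that trades area for throughput.
void scale(const std::int32_t in[N], std::int32_t out[N], std::int32_t k) {
    for (std::size_t i = 0; i < N; ++i) {
#pragma HLS UNROLL factor=4
        out[i] = in[i] * k;
    }
}

// Prefix sum: each iteration reads the accumulator written by the
// previous one (a loop-carried dependency), so the iterations cannot
// simply run side by side. Pipelining is requested instead of unrolling.
void prefix_sum(const std::int32_t in[N], std::int32_t out[N]) {
    std::int32_t acc = 0;
    for (std::size_t i = 0; i < N; ++i) {
#pragma HLS PIPELINE II=1
        acc += in[i];
        out[i] = acc;
    }
}
```

Choosing between these directives by hand requires spotting the loop-carried dependency in `prefix_sum` and its absence in `scale`, which is the kind of characteristic an automated analysis of the input algorithm would be expected to report.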