List of works by Xavier Martorell

A Novel Asynchronous Software Cache Implementation for the Cell-BE Processor

A Proposal for Error Handling in OpenMP

article by Alejandro Duran et al published 28 June 2007 in International Journal of Parallel Programming

A Proposal to Extend the OpenMP Tasking Model for Heterogeneous Architectures

article

A Systematic Methodology to Generate Decomposable and Responsive Power Models for CMPs

ACOTES Project: Advanced Compiler Technologies for Embedded Streaming

AXIOM: A Hardware-Software Platform for Cyber Physical Systems

Accurate energy accounting for shared virtualized environments using PMC-based power modeling techniques

Achieving high memory performance from heterogeneous architectures with the SARC programming model

Adaptive and Speculative Memory Consistency Support for Multi-core Architectures with On-Chip Local Memories

Analysis of Task Offloading for Accelerators

Analyzing Data-Error Propagation Effects in High-Performance Computing

Analyzing the impact of communication imbalance in high-speed networks

Analyzing the impact of programming models for efficient communication overlap in high-speed networks

scholarly article published July 2014

Application/kernel cooperation towards the efficient execution of shared-memory parallel Java codes

Automatic Pre-Fetch and Modulo Scheduling Transformations for the Cell BE Architecture

Automatic Prefetch and Modulo Scheduling Transformations for the Cell BE Architecture

Barcelona OpenMP Tasks Suite: A Set of Benchmarks Targeting the Exploitation of Task Parallelism in OpenMP

Boosting irregular array Reductions through In-lined Block-ordering on fast processors

CUDAlign 3.0: Parallel Biological Sequence Comparison in Large GPU Clusters

Characterizing and Improving the Performance of Many-Core Task-Based Parallel Programming Runtimes

Coherence protocol for transparent management of scratchpad memories in shared memory manycore architectures

Complex pipelined executions in OpenMP parallel applications

Counter-Based Power Modeling Methods: Top-Down vs. Bottom-Up

DMA++: On the Fly Data Realignment for On-Chip Memories

article by Nikola Vujic et al published February 2012 in IEEE Transactions on Computers

DMA-circular

Decomposable and responsive power models for multicore processors using performance counters

Design space exploration for aggressive core replication schemes in CMPs

Dual-Level Parallelism Exploitation with OpenMP in Coastal Ocean Circulation Modeling

Employing nested OpenMP for the parallelization of multi-zone computational fluid dynamics applications

article

Enabling high-level parallel programming on multi-FPGA clusters

scientific article published on 17 June 2024

Energy accounting for shared virtualized environments under DVFS using PMC-based power models

Evaluating the Impact of OpenMP 4.0 Extensions on Relevant Parallel Workloads

Evaluating the Performance Impact of Communication Imbalance in Sparse Matrix-Vector Multiplication

scholarly article published March 2015

Evaluation of memory performance on the cell BE with the SARC programming model

article published in 2008

Exploiting Parallelism on GPUs and FPGAs with OmpSs

Exploiting parallelism through directives on the nano-threads programming model

Exploiting pipelined executions in OpenMP

Exploring Memory Error Vulnerability for Parallel Programming Models

Extending OpenMP to Survive the Heterogeneous Multi-Core Era

Fine-grain parallel megabase sequence comparison with multiple heterogeneous GPUs

scientific article (publication date: 2014)

Hardware-software coherence protocol for the coexistence of caches and local memories

Hardware–Software Coherence Protocol for the Coexistence of Caches and Local Memories

article published in 2015

Heterogeneous tasking on SMP/FPGA SoCs: The case of OmpSs and the Zynq

Hybrid access-specific software cache techniques for the cell BE architecture

scholarly article published 2008

Implementing OmpSs support for regions of data in architectures with multiple address spaces

In search of the best MPI-OpenMP distribution for optimum Intel-MIC cluster performance

Leveraging OmpSs to Exploit Hardware Accelerators

Local Memory Design Space Exploration for High-Performance Computing

Makinote: An FPGA-Based HW/SW Platform for Pre-Silicon Emulation of RISC-V Designs

scientific article published on 06 March 2024

Migration of a generic multi-physics framework to HPC environments

NVIDIA GPUs Scalability to Solve Multiple (Batch) Tridiagonal Systems Implementation of cuThomasBatch

NanosCompiler: supporting flexible multilevel parallelism exploitation in OpenMP

Nested Parallelism and Pipelining in OpenMP

OmpSs: A Proposal for Programming Heterogeneous Multi-Core Architectures

article by Alejandro Duran et al published June 2011 in Parallel Processing Letters

OmpSs@Zynq all-programmable SoC ecosystem

On the Instrumentation of OpenMP and OmpSs Tasking Constructs

article

OpenMP Extensions for Thread Groups and Their Run-Time Support

OpenMP extensions for FPGA accelerators

OpenMP tasking analysis for programmers

OpenMP tasks in IBM XL compilers

Optimizing NANOS OpenMP for the IBM Cyclops Multithreaded Architecture

scholarly article

Optimizing the Exploitation of Multicore Processors and GPUs with OpenMP and OpenCL

article published in 2011

POTRA

Page Migration with Dynamic Space-Sharing Scheduling Policies: The Case of the SGI O2000

Performance Analysis of Cell Broadband Engine for High Memory Bandwidth Applications

Performance-driven processor allocation

article published in 2005

Poster

Productive Cluster Programming with OmpSs

Productive Programming of GPU Clusters with OmpSs

Reducing data access latency in SDSM systems using runtime optimizations

Resource-Aware Task Scheduling

Running OpenMP applications efficiently on an everything-shared SDSM

Runtime Address Space Computation for SDSM Systems

Runtime-Guided Management of Scratchpad Memories in Multicore Architectures

Support for OpenMP tasks in Nanos v4

Supporting Adaptive Privatization Techniques for Irregular Array Reductions in Task-Parallel Programming Models

Techniques supporting threadprivate in OpenMP

The AXIOM Project: IoT on Heterogeneous Embedded Platforms

scientific article published in 2021

The AXIOM Software Layers

The AXIOM platform for next-generation cyber physical systems

The AXIOM project (Agile, eXtensible, fast I/O Module)

The AXIOM software layers

The Mont-Blanc Prototype: An Alternative Approach for HPC Systems

The Secrets of the Accelerators Unveiled: Tracing Heterogeneous Executions Through OMPT

article

Towards Task-Parallel Reductions in OpenMP

article published in 2015

Transactional Memory and OpenMP

Transient Congestion Avoidance in Software Distributed Shared Memory Systems

Unrolling Loops Containing Task Parallelism

Variable Batched DGEMM

cuHinesBatch: Solving Multiple Hines systems on GPUs Human Brain Project

article