Email: wwlfung(at)ece(dot)ubc(dot)ca
Office: KAIS 4075
As a member of the computer architecture research lab, I am interested in anything that can improve computing power through architecture (and so enable new applications for computers). There are just too many interesting problems in computer architecture waiting to be solved. Here is a list of topics that interest me (the most!):
Education
Master of Applied Science in Computer Engineering, The University of British Columbia - Completed October 2008.
Bachelor of Applied Science in Computer Engineering, The University of British Columbia - Completed August 2006.
Recent Employment
May 2009 - August 2009: Architecture Intern, NVIDIA Co., Santa Clara
May 2008 - August 2008: Architecture Intern, NVIDIA Co., Santa Clara
June 2004 - May 2005: Research Intern, SANYO Electric Co. Ltd., Tokyo
January 2004 - April 2004: Test Engineer, PMC Sierra Inc., Vancouver
September 2002 - April 2003: Software Developer, UBC Department of Electrical and Computer Engineering, Vancouver
Wilson W. L. Fung, Tor M. Aamodt.
Energy Efficient GPU Transactional Memory via Space-Time Optimizations.
46th Annual IEEE/ACM International Symposium on Microarchitecture (MICRO-46), 2013.
[PowerPoint slides]
Hadi Jooybar, Wilson W. L. Fung, Mike O'Connor, Joseph Devietti, Tor M. Aamodt. GPUDet: A Deterministic GPU Architecture. 18th International Conference on Architectural Support for Programming Languages and Operating Systems (ASPLOS 2013), 2013.
Inderpreet Singh, Arrvindh Shriraman, Wilson W. L. Fung, Mike O'Connor, Tor M. Aamodt. Cache Coherence for GPU Architectures. 19th IEEE International Symposium on High-Performance Computer Architecture (HPCA-19), 2013. Selected for IEEE Micro Top Picks
Wilson W. L. Fung, Inderpreet Singh, Andrew Brownsword, Tor M. Aamodt.
Kilo TM: Hardware Transactional Memory for GPU Architectures.
IEEE Micro, Special Issue: Micro's Top Picks from 2011 Computer Architecture Conferences. Volume 32, No. 3, pp. 7-16, May/June 2012.
Wilson W. L. Fung, Inderpreet Singh, Tor M. Aamodt. Kilo TM Correctness: ABA Tolerance and Validation-Commit Indivisibility. Technical Report, University of British Columbia, 24 May 2012.
Wilson W. L. Fung, Inderpreet Singh, Andrew Brownsword, Tor M. Aamodt.
Hardware Transactional Memory for GPU Architectures.
44th Annual IEEE/ACM International Symposium on Microarchitecture (MICRO-44), 2011. Selected for IEEE Micro Top Picks
Wilson W. L. Fung, Tor M. Aamodt.
Thread Block Compaction for Efficient SIMT Control Flow.
17th IEEE International Symposium on High-Performance Computer Architecture (HPCA-17), 2011.
[slides]
Aaron Arial, Wilson W. L. Fung, Andrew M. Turner, Tor M. Aamodt.
Visualizing Complex Dynamics in Many-Core Accelerator Architectures.
IEEE International Symposium on Performance Analysis of Systems and Software (ISPASS 2010): 164-174
Wilson W. L. Fung, Ivan Sham, George Yuan, Tor M. Aamodt.
Dynamic Warp Formation: Efficient MIMD Control Flow on SIMD Graphics Hardware.
ACM Transactions on Architecture and Code Optimization (TACO). Volume 6, Issue 2, Article 7. (June 2009), 1-37.
Ali Bakhoda, George L. Yuan, Wilson W. L. Fung, Henry Wong, Tor M. Aamodt.
Analyzing CUDA Workloads Using a Detailed GPU Simulator.
IEEE International Symposium on Performance Analysis of Systems and Software (ISPASS 2009): 163-174
Wilson W. L. Fung, Ivan Sham, George Yuan, Tor M. Aamodt.
Dynamic Warp Formation and Scheduling for Efficient GPU Control Flow.
40th Annual IEEE/ACM International Symposium on Microarchitecture (MICRO 2007): 407-420
Wilson Wai Lun Fung, Akiomi Kunisa.
Rotation, Scaling, and Translation-Invariant Multi-Bit Watermarking based on Log-Polar Mapping and Discrete Fourier Transform. ICME 2005
Robert Rohling, Wilson Fung, Pedram Lajevardi.
PUPIL: Programmable Ultrasound Platform and Interface Library. MICCAI (2) 2003: 424-431
color-make download
This is a wrapper around "make" that adds color to its output for easier
debugging. Why is this a better solution than color-gcc? This script will run
anywhere with a standard Perl installation and zero setup (i.e. you do not
need root access to install it!).
perlSPICE download
This is a handy (and SIMPLE!) script that I wrote while taking a course that involved SPICE simulation. While I am sure that
there are tools that let you generate circuits automatically, nothing beats the flexibility of using Perl code
inside a netlist file to generate a circuit with a mini program.
To run the code,
>./perlspice.pl <spice netlist file containing perl code>
To prevent confusion with the generated SPICE netlist, I decided to make the script work only on input files with the
extension .spl. Further usage notes can be found inside the script. Be sure to check that the path to spice is correct.
This is an optimization tip discovered by Henry Wong:
There is a maximum operator (>?=) in GNU's gcc. It is much faster than the normal if-else statement because it gets optimized into cmov (rather than jne) on x86. Too bad it is deprecated, as it is not part of the ANSI C standard, but here is a form that meets the standard and is translated to the same code.
Just replace a >?= b; with:
a = (a >= b) ? a : b;
and to avoid the extra typing, just define a macro :-)
#define MAX(x, y) (((x) >= (y))? (x) : (y))
Update: The problem with this macro is that it may evaluate an argument twice, so if an input (x or y) comes with an increment/decrement operator (++ or --), the side effect can happen twice and the macro will screw up. In the end, one should just know what the compiler (and the preprocessor) is doing...
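To make the pitfall concrete, here is a minimal sketch of what happens when the macro is given an argument with a side effect. The helper names (max_int, macro_demo, function_demo) are made up for this illustration and are not part of the original tip; max_int is just one possible workaround, assuming int operands, and whether the compiler turns it into cmov is not something this sketch checks.

```c
#include <assert.h>
#include <stdio.h>

/* The macro from the tip above: the winning argument is evaluated twice. */
#define MAX(x, y) (((x) >= (y)) ? (x) : (y))

/* Hypothetical alternative: a function evaluates each argument exactly once. */
int max_int(int x, int y) {
    return (x >= y) ? x : y;
}

/* Uses the macro with i++; stores the result and returns the final i. */
int macro_demo(int *result) {
    int i = 5;
    /* Expands to (((i++) >= (3)) ? (i++) : (3)), so i is incremented twice. */
    *result = MAX(i++, 3);
    return i;
}

/* Same call through the function; i is incremented only once. */
int function_demo(int *result) {
    int i = 5;
    *result = max_int(i++, 3);
    return i;
}

/* Prints both outcomes side by side. */
void print_demo(void) {
    int r, i;
    i = macro_demo(&r);
    printf("macro:    result=%d i=%d\n", r, i);   /* result=6 i=7 */
    i = function_demo(&r);
    printf("function: result=%d i=%d\n", r, i);   /* result=5 i=6 */
}
```

With the macro, the comparison reads i (5) and bumps it to 6, then the selected branch reads i again (6) and bumps it to 7, so the "maximum" comes out as 6 instead of 5. Passing the arguments through a function avoids this because they are evaluated before the call.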