I’m a chip designer working on the digital side. I’ve got experience with
- CPU/SoC architecture and design, especially RISC-V open ISA
- IC design/verification with Verilog/SystemVerilog/SystemC
- Low power design and optimization
- ASIC design flow, including front-end, back-end and power sign-off
- Semi-custom design flow, including transistor timing analysis and SPICE simulation
Currently my interests are
- Hardware and software co-design
- SoC generator
- Machine learning accelerator
If you share the same interests and want to discuss them, please send me a message on LinkedIn.
Work experience
Vector regfile
- 32 of them, v0 to v31
- Each is VLEN bits
- Each can be divided into several elements
  - The max element width is ELEN
  - CSR vsew maps to SEW (standard element width), which controls the element width dynamically
  - CSR vl controls the number of elements to operate on for vector instructions
Packing of shorter vectors
- When SEW is smaller than ELEN, multiple SEW-wide elements are packed into one ELEN unit, following the little-endian rule
- ELEN units are packed into the VLEN register, also following the little-endian rule
Storage of longer vectors
- If an operand longer than SEW is needed, then
  - Even-numbered vector registers hold the even-numbered elements
  - Odd-numbered vector registers hold the odd-numbered elements
  - WHY?
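The little-endian packing described above can be sketched in C++. This is only an illustration: the helper names are made up for this note, and the register is modelled as a plain byte array with byte 0 as the least-significant byte.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helpers, for illustration only (not from the spec).

// Number of SEW-bit elements that fit in one VLEN-bit vector register.
constexpr uint32_t elems_per_reg(uint32_t vlen, uint32_t sew) {
    return vlen / sew;
}

// Bit offset of element `idx` inside the register under little-endian
// packing: element 0 sits in the least-significant bits.
constexpr uint32_t elem_bit_offset(uint32_t sew, uint32_t idx) {
    return idx * sew;
}

// Read element `idx` from a register stored as bytes (byte 0 = LSB).
// Limited to SEW <= 32 to keep the sketch short.
inline uint32_t read_elem(const uint8_t* reg, uint32_t sew, uint32_t idx) {
    assert(sew % 8 == 0 && sew <= 32);
    uint32_t first_byte = elem_bit_offset(sew, idx) / 8;
    uint32_t value = 0;
    for (uint32_t b = 0; b < sew / 8; ++b) {
        value |= static_cast<uint32_t>(reg[first_byte + b]) << (8 * b);
    }
    return value;
}
```

With an assumed VLEN of 128, a register holds 16 elements at SEW = 8 and 4 elements at SEW = 32; element 3 at SEW = 16 starts at bit 48.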
    // A simple example
    #include <systemc.h>

    SC_MODULE(seq_and2) {  // sequential AND2
        sc_in<sc_uint<8> > a;
        sc_in<sc_uint<8> > b;
        sc_out<sc_uint<8> > f;
        sc_in<bool> clk;

        void func() {
            while (true) {
                wait();  // suspend until the next falling clock edge
                f.write(a.read() & b.read());
            }
        }

        SC_CTOR(seq_and2) {
            // Clocked thread: the clock edge is the second argument
            SC_CTHREAD(func, clk.neg());
        }
    };

Port & signal
- Port
  - sc_in & sc_out
  - .read() & .write() functions
- Signal
  - sc_signal
Threads
- SC_METHOD()
  - Just like always_comb in Verilog, but you have to list the sensitivity list
- SC_THREAD()
  - Not commonly used
  - Behaves like initial in Verilog
- SC_CTHREAD(function name, clock sensitive)
  - Most commonly used
  - Only sensitive to a clock edge, just like always_ff in Verilog
  - Not limited to one cycle
- sensitive keyword to define the sensitivity list
Datatypes
- Integers
My notes on RISC-V Summit 2018 at Santa Clara Convention Center
This year’s summit has many more participants than the last one, which suggests RISC-V is gaining a lot of momentum around the world. Although most of the talks are promotional and light on technical detail, there is still something useful to take away from them. More importantly, talking to the engineers manning the booths is very interesting and information-rich.
SiFive’s business model
Help customers tape out prototypes, then sell the chips back to them.
Ariane Document
Architecture note
PC gen stage: The fetch address for the i-cache is always word-aligned.
Fetch stage: The fetch stage doesn’t have much decoding work to do, only what is necessary to generate the next PC. It relies on branch prediction to provide the next PC.
There is an internal FIFO with 2 entries to log the PC (and other meta-info) that was sent to i-cache, while waiting for its response.
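A minimal C++ model of such a 2-entry FIFO is sketched below. It is not Ariane's RTL: the class name, the interface, and the choice to track only the PC (no other meta-info) are assumptions made for illustration.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical model, not Ariane's actual RTL: a 2-entry FIFO remembering
// the PCs of requests that are still waiting for an i-cache response.
class FetchFifo {
public:
    bool full() const { return count_ == kDepth; }
    bool empty() const { return count_ == 0; }

    // Log a PC when a request is sent to the i-cache.
    // Returns false (the request must stall) if both entries are in use.
    bool push(uint64_t pc) {
        if (full()) return false;
        entries_[(head_ + count_) % kDepth] = pc;
        ++count_;
        return true;
    }

    // Pop the oldest PC when the i-cache response arrives.
    uint64_t pop() {
        assert(!empty());
        uint64_t pc = entries_[head_];
        head_ = (head_ + 1) % kDepth;
        --count_;
        return pc;
    }

private:
    static constexpr std::size_t kDepth = 2;
    std::array<uint64_t, kDepth> entries_{};
    std::size_t head_ = 0;
    std::size_t count_ = 0;
};
```

In this sketch a third outstanding request is refused until a response pops the oldest entry, which is how a 2-entry log would bound the number of fetches in flight.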
NoC
Clustering coefficient: the most intuitive explanation is the number of hops between two random nodes in the network.
Layers
- Physical layer
- Link layer
- Transaction protocol: such as AXI
  - Separate channels, as in AXI, avoid deadlock caused by dependency problems
- Transport layer
  - Packet: header & payload
Flow control
- On link level or end-to-end level
- Link-level: there is a notification at every hop, just like a valid-ready protocol
- End-to-end level: for every transfer of data from sender to receiver, the receiver has to notify the sender that it has received the data successfully.
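Link-level flow control can be sketched as a chain of one-flit buffers where a flit advances only when the next hop is free, which is the valid-ready idea applied per hop. The model below is a simplification with made-up names (`Slot`, `Link`), not any particular NoC implementation.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// One buffer slot per hop: `valid` says whether it currently holds a flit.
struct Slot {
    bool valid;
    int flit;
};

// Hypothetical link made of `num_hops` one-flit buffers.
class Link {
public:
    explicit Link(std::size_t num_hops) : hops_(num_hops, Slot{false, 0}) {
        assert(num_hops > 0);
    }

    // Sender side: the transfer happens only if the first hop is ready.
    bool send(int flit) {
        if (hops_.front().valid) return false;  // back-pressure
        hops_.front() = Slot{true, flit};
        return true;
    }

    // Advance one cycle. The receiver drains the last hop if `sink_ready`.
    // Returns the delivered flit (valid == false if nothing was delivered).
    Slot step(bool sink_ready) {
        Slot delivered = Slot{false, 0};
        if (sink_ready && hops_.back().valid) {
            delivered = hops_.back();
            hops_.back().valid = false;
        }
        // Walk from the far end so each flit moves at most one hop per cycle.
        for (std::size_t i = hops_.size() - 1; i > 0; --i) {
            if (!hops_[i].valid && hops_[i - 1].valid) {
                hops_[i] = hops_[i - 1];
                hops_[i - 1].valid = false;
            }
        }
        return delivered;
    }

private:
    std::vector<Slot> hops_;
};
```

When the receiver is not ready, the buffers fill up hop by hop until `send` is refused, so the sender learns about congestion without any end-to-end message.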
A piece of very precious memory
Home –> LEGOLAND
264 miles, 4 hours non-stop (with stops, 7:00AM to 12:00PM)
Lunch at Wendy’s: 5821 Dennis McCarthy Dr, Lebec, CA 93243
Wendy’s is just another burger place. We ended up with Panda Express at the same rest area.
174 miles, 3 hours non-stop (with stops, 1:30PM to 4:30PM)
LEGOLAND Castle Hotel
Coherence mechanism
Snooping
Every cache maintains its own cache state. When it needs to access a shared address, it sends snoop messages to all the other caches to either update or invalidate their copies.
- Write invalidate: a write invalidates all the other shared copies. The other caches have to read the line again from the next level of cache before they can use it.
- Write update: a write sends the written data to the other shared copies and updates them accordingly.
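The difference between the two policies can be shown with a toy model of a single shared address. Everything here (`CacheLine`, `Policy`, `snoop_write`) is a hypothetical simplification; real protocols such as MSI or MESI track more states.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical single-address model: each cache either holds a valid copy
// of the shared line or not.
struct CacheLine {
    bool valid = false;
    int data = 0;
};

enum class Policy { WriteInvalidate, WriteUpdate };

// Cache `writer` writes `value`; the snoop message reaches all other caches.
inline void snoop_write(std::vector<CacheLine>& caches, std::size_t writer,
                        int value, Policy policy) {
    assert(writer < caches.size());
    caches[writer].valid = true;
    caches[writer].data = value;
    for (std::size_t i = 0; i < caches.size(); ++i) {
        if (i == writer || !caches[i].valid) continue;  // no copy to touch
        if (policy == Policy::WriteInvalidate) {
            caches[i].valid = false;  // must re-read from the next level
        } else {
            caches[i].data = value;   // copy stays valid with the new data
        }
    }
}
```

After a write under write invalidate only the writer holds a valid copy; under write update every cache that already shared the line sees the new value immediately.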
Register renaming
Eliminates false and output data dependencies by providing more physical registers than architectural registers.
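A minimal sketch of the idea follows, assuming 4 architectural and 8 physical registers and a free list that is just a counter. The sizes and the `Renamer` name are invented for illustration; a real design recycles physical registers at commit.

```cpp
#include <array>
#include <cassert>

// Hypothetical sizes: 4 architectural registers mapped onto 8 physical ones.
constexpr int kArchRegs = 4;
constexpr int kPhysRegs = 8;

struct Renamer {
    std::array<int, kArchRegs> map;  // architectural -> physical
    int next_free;                   // simplistic free list: just a counter

    Renamer() : next_free(kArchRegs) {
        for (int i = 0; i < kArchRegs; ++i) map[i] = i;
    }

    // Sources read the current mapping, so true (read-after-write)
    // dependencies are preserved.
    int rename_src(int arch) const { return map[arch]; }

    // Every destination gets a fresh physical register, so a later write to
    // the same architectural register no longer conflicts with earlier
    // readers (write-after-read) or earlier writers (write-after-write).
    int rename_dst(int arch) {
        assert(next_free < kPhysRegs);  // a real design would stall here
        map[arch] = next_free++;
        return map[arch];
    }
};
```

For the sequence `r1 = r2 + r3; r2 = r1 + r3; r1 = r3 + r3`, the second instruction still reads the physical register written by the first, while its own destination and the third instruction's destination land in new physical registers.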
- Read-after-write (RAW) is a true data dependency
- Write-after-write (WAW) is an output data dependency
- Write-after-read (WAR) is a false data dependency (also called an anti-dependency)
Superscalar
Dynamically issues multiple instructions in each cycle to increase IPC.
- Normally needs multi-port register files and multiple ALUs to avoid structural hazards.
- Can be in-order or out-of-order
Re-order buffer
In an out-of-order CPU architecture, results are put into the re-order buffer, where they wait for commit.
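The re-order buffer behaviour can be sketched as a queue whose entries complete in any order but commit only from the head. The names and the tag scheme below are assumptions for illustration, not a specific core's design.

```cpp
#include <cassert>
#include <cstddef>
#include <deque>
#include <vector>

// Hypothetical re-order buffer entry.
struct RobEntry {
    int dest_reg;
    int value;
    bool done;
};

class ReorderBuffer {
public:
    // Allocate an entry in program order at issue; returns its tag.
    std::size_t allocate(int dest_reg) {
        entries_.push_back(RobEntry{dest_reg, 0, false});
        return base_ + entries_.size() - 1;
    }

    // An execution unit writes back its result, possibly out of order.
    void complete(std::size_t tag, int value) {
        assert(tag >= base_ && tag - base_ < entries_.size());
        entries_[tag - base_].value = value;
        entries_[tag - base_].done = true;
    }

    // Commit finished entries from the head, in program order, into `regs`.
    // Returns how many instructions were committed.
    std::size_t commit(std::vector<int>& regs) {
        std::size_t n = 0;
        while (!entries_.empty() && entries_.front().done) {
            regs[entries_.front().dest_reg] = entries_.front().value;
            entries_.pop_front();
            ++base_;
            ++n;
        }
        return n;
    }

private:
    std::deque<RobEntry> entries_;
    std::size_t base_ = 0;  // tag of the entry currently at the head
};
```

If the younger of two instructions finishes first, `commit` does nothing until the older one completes, and then both retire in program order.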
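1. Putting the lenses in before bed
- Wait at least two hours after using atropine, to make sure the medication has been absorbed.
- Wash your hands and dry them with a kitchen paper towel (it leaves no lint).
- Apply eye drops to moisten the eyes.
- The gray lens is for the right eye: GRAY has an R for RIGHT. The blue lens is for the left eye: BLUE has an L for LEFT.
- Fill the inside of the lens with eye drops; lower your head, pull the upper and lower eyelids apart, and place the lens in slowly, keeping it level.
- Check that the lens sits at the center of the eye. If it does not, close the eye and nudge the lens gently into place with a finger from outside the eyelid.
- Check that there are no large visible air bubbles, which would reduce the effect of the treatment. If there are small bubbles, tilt your head back, add a few eye drops, then close the eye and roll the eyeball to squeeze the bubbles out. This can be repeated a few times. If it does not clearly help, take the lens out with the suction cup and place it again.
- Once both lenses are placed successfully, add one more drop to each eye.
- If the eye itches, press on it gently, but never rub it.
- If a lens falls on the table or the floor, clean it with BLINK solution while rubbing it gently with a finger.
2. Taking the lenses out in the morning
- Put one drop in each eye.
- Take the lens out with the suction cup. Try to attach it to the lower edge of the lens rather than the center, because the center is thinner and easily damaged. If the suction feels too weak, blot it dry with a kitchen paper towel.
- When separating the suction cup from the lens, do not pull straight off; slide it off gently.
- After the lenses are out, put in one more drop.
- Then put the lenses into the ONE-SHOT storage case, taking care to keep left and right apart.
- Pour ONE-SHOT solution into the storage case up to the marked line and leave it for at least 6 hours. If the lenses were not worn the previous night, the ONE-SHOT solution still needs to be replaced and left for at least 6 hours.
- ONE-SHOT solution only becomes harmless after reacting for 6 hours; before that it irritates the eyes. Its main ingredient is hydrogen peroxide.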
When using a counter to divide a clock, don’t reset the counter, especially when you are using synchronous reset. Resetting the counter keeps the divided clock quiet during reset. If the flip-flops on that divided clock use a synchronous reset, they never see a clock edge while reset is asserted, so they won’t be reset at all.
But without a reset, the counter will be “X” in simulation.
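    logic [1:0] cntr;

    // Simulation-only initial value, so the counter does not start as X
    `ifndef SYNTHESIS
    initial begin
        cntr = 2'b00;
    end
    `endif

    // Free-running counter: no reset, so the divided clock keeps toggling
    always_ff @(posedge clk) begin
        cntr <= cntr + 1;
    end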
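The effect can be checked with a small behavioural model in C++. It assumes the divided clock is taken from the counter MSB, which the note above does not specify, and the function names are made up for this sketch.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical behavioural model of the 2-bit divider. The divided clock is
// assumed to be the counter MSB. Returns its level after each input edge.
inline std::vector<int> divided_clock(int cycles, bool counter_has_sync_reset,
                                      int reset_cycles) {
    uint8_t cntr = 0;
    std::vector<int> wave;
    for (int c = 0; c < cycles; ++c) {
        bool in_reset = c < reset_cycles;
        if (counter_has_sync_reset && in_reset) {
            cntr = 0;  // counter held: the divided clock is quiet
        } else {
            cntr = static_cast<uint8_t>((cntr + 1) & 0x3);  // free-running
        }
        wave.push_back((cntr >> 1) & 1);
    }
    return wave;
}

// Count rising edges on the divided clock during the first `window` cycles.
inline int rising_edges(const std::vector<int>& wave, int window) {
    int edges = 0;
    for (int c = 1; c < window && c < static_cast<int>(wave.size()); ++c) {
        if (wave[c - 1] == 0 && wave[c] == 1) ++edges;
    }
    return edges;
}
```

With reset asserted for 8 input cycles, the free-running counter still produces rising edges on the divided clock, while the counter with a synchronous reset produces none, so downstream flip-flops with a synchronous reset would never sample it.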