[VTA] [Hardware] Chisel implementation #3258

Merged
merged 41 commits into from
Jun 5, 2019

vegaluisjose
Member

This PR provides a Chisel implementation of VTA that runs on top of TSIM. It successfully runs all the Conv2D layers of ResNet-18 as well as other unit tests. The testing directory is currently located at tvm/vta/tests/python/tsim. We can modify or merge it later, once we figure out how we are going to structure everything.

Let me know if you have any feedback.

}

object ISA {
def LUOP = BitPat("b_????????_????????_????????_????????_????????_????????_????????_????????_????????_????????_????????_????????_????????_????????_???????0_0????000")
Member

Is there a better way to describe these than hand-modifying the bit patterns, e.g. a Scala abstraction?

Contributor

Indeed, this can be quite overwhelming and error-prone to modify and extend. It would be nice to find a cleaner way to do this (e.g. stating which bits in the field are relevant and what their values should be).

Member Author

I think we can discuss this later and come up with a better way to do it. I have an idea, but I would like to see your take on this.

Contributor

Can we find a better way to decode? This could become difficult to parse if the instruction width grows.
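One possible direction for the abstraction requested in this thread is to state only the constrained fields and generate the pattern string from them. The sketch below is in Python purely for illustration, not Chisel, and `bit_pattern` is a hypothetical helper. The field names `opcode` and `mem_id` and their positions are assumptions read off the fixed bits of the LUOP pattern above (bits 2..0 and bits 8..7, all zero), and the instruction width is assumed to be 128 bits, a multiple of 8.

```python
def bit_pattern(width, fields):
    """Build a BitPat-style string from {name: (lsb, nbits, value)} fields.

    Every bit not covered by a field stays a don't-care ('?').
    Assumes width is a multiple of 8 so the 8-bit groups line up.
    """
    bits = ["?"] * width  # index 0 is the least-significant bit
    for name, (lsb, nbits, value) in fields.items():
        if value < 0 or value >= (1 << nbits) or lsb + nbits > width:
            raise ValueError(f"field {name!r} does not fit")
        for i in range(nbits):
            bits[lsb + i] = str((value >> i) & 1)
    msb_first = "".join(reversed(bits))
    groups = [msb_first[i:i + 8] for i in range(0, width, 8)]
    return "b_" + "_".join(groups)


# Assumed layout: a 3-bit opcode at bit 0 and a 2-bit memory id at bit 7.
LUOP = bit_pattern(128, {"opcode": (0, 3, 0), "mem_id": (7, 2, 0)})
print(LUOP)
```

With a helper of this kind, widening the instruction or moving a field means changing one tuple rather than recounting question marks by hand.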

@jroesch
Member

jroesch commented May 30, 2019

Okay, I did two passes over the design. Overall it looks good to me and mostly makes sense at a mid-level of understanding. My main request is for more high-level explanation of what you are doing, the common design patterns, and how things fit together, so that new people can get up to speed by reading the code without needing to ask as many questions. It would be good to bake in some of your design insights, comment on tricky pieces, and explain optimizations where you can.

@vegaluisjose
Member Author

[Update] Alright, thanks for the heroic (detailed) review, @jroesch @liangfu @huajsj. I just finished incorporating the feedback from all of you. I know there are still gaps in the documentation of how everything works, but I think we can fill those in along the way.

@tqchen
Member

tqchen commented Jun 1, 2019

Let us talk about plans for tests/python/tsim/ as we finish the migration. I hope we end up with a unified VTA test suite that covers sim, tsim, and remote (if it is available).

The original VTA test infrastructure (https://github.com/dmlc/tvm/blob/master/vta/tests/python/unittest/test_vta_insn.py#L74) is built so that testing.run runs the test case for each available environment. See the implementation of run: https://github.com/dmlc/tvm/blob/master/vta/python/vta/testing/util.py#L34

I hope we can reuse that instead of creating separate test cases for tsim only.

@vegaluisjose
Member Author

vegaluisjose commented Jun 1, 2019

This modification makes test_vta_insn.py work for sim, tsim, and remote.

Does this change seem good? @tqchen

@tqchen
Member

tqchen commented Jun 3, 2019

@vegaluisjose @jroesch A better way would be to move vta_sim_init into a with statement, given that this is a context scope. Example code:

# enter the hw environment
with vta.Environment(tsim_hw="xyz.so"):
    # code in effect

with vta.Environment(tsim_hw="xyz2_hw.so"):
    # work on xyz2

See related code https://github.com/dmlc/tvm/blob/master/vta/python/vta/environment.py#L182

Once you do that, you no longer need to add sim_init calls. Instead, you can just modify vta.testing.run.
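For readers unfamiliar with the mechanism behind this suggestion, here is a minimal sketch of a scoped setting implemented as a context manager. It does not use vta.Environment; `hw_scope`, `current_hw`, and `_hw_stack` are hypothetical stand-ins that only illustrate how entering a with block can select a hardware library and leaving it can restore the previous one.

```python
from contextlib import contextmanager

_hw_stack = []  # hypothetical stand-in for the environment's active hardware library


def current_hw():
    """Return the hardware library in effect, or None outside any scope."""
    return _hw_stack[-1] if _hw_stack else None


@contextmanager
def hw_scope(hw_lib):
    """Make hw_lib active inside the with block, then restore the previous one."""
    _hw_stack.append(hw_lib)
    try:
        yield hw_lib
    finally:
        # Runs even if the block raises, so the outer scope is always restored.
        _hw_stack.pop()


with hw_scope("xyz.so"):
    print(current_hw())      # the outer library is in effect
    with hw_scope("xyz2_hw.so"):
        print(current_hw())  # the inner library shadows it
    print(current_hw())      # back to the outer library
```

Because the restore happens in a finally clause, a failing test cannot leave a stale hardware library selected for the tests that run after it.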

}

class UopDecode extends Bundle {
val u2 = UInt(10.W)
Contributor

Can we parameterize these bits from the uop decode?

Member Author

The real question here is: are we going to support micro-ops larger than 32 bits? Otherwise we would just be hardcoding the field width in another variable.

Contributor

We should consider supporting micro-ops larger than 32 bits.

val M_STRIDE_BITS = 16
val M_PAD_BITS = 4

val C_UOP_BGN_BITS = 13
Contributor

Right now where are these parameters derived from? (from the code block below)

Member Author

@vegaluisjose vegaluisjose Jun 4, 2019

The design pattern here is the following

Member Author

Also, the BitPat "design pattern" was taken from here. Perhaps it looks less overwhelming for 32-bit and 64-bit machines than for 128-bit machines. We can definitely think more about this later on.

Contributor

My follow-up question is: are these parameters just default values that will be derived from a top-level config file? Or, if I change the size of the wgt memory, will I need to change these bit positions?
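To make the question concrete, here is a sketch of how an index width could be derived from a configured buffer size instead of being written as a literal. The thread does not say whether the Chisel code computes it this way. `index_bits` is a hypothetical helper, and the 32 KiB micro-op buffer size is an assumed configuration value; with the 32-bit (4-byte) micro-ops mentioned earlier in this review, it yields the same 13 as C_UOP_BGN_BITS.

```python
def index_bits(buffer_bytes, entry_bytes):
    """Number of bits needed to address every entry of a power-of-two buffer."""
    depth = buffer_bytes // entry_bytes
    if depth <= 0 or depth & (depth - 1):
        raise ValueError("buffer depth must be a positive power of two")
    return depth.bit_length() - 1  # log2(depth) for a power of two


# Assumed configuration: 32 KiB micro-op buffer, 4-byte micro-ops.
uop_bgn_bits = index_bits(32 * 1024, 4)
print(uop_bgn_bits)
```

Under this scheme, resizing a memory in the top-level configuration would change the derived widths automatically, which is the behavior the question is asking about.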

@vegaluisjose
Member Author

Btw, I don't know why CI is now breaking on the GPU frontend side. @tqchen @jroesch @tmoreau89

@tqchen
Member

tqchen commented Jun 4, 2019

Please ignore the GPU error. See some final comments below.

"""Init hardware library for TSIM"""
cur_path = os.path.dirname(os.path.abspath(os.path.expanduser(__file__)))
vta_build_path = os.path.join(cur_path, "..", "..", "..", "build")
ext = ".dylib" if sys.platform == "darwin" else ".so"
Member

Check whether the extension already exists.
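One reading of this suggestion is that the platform extension should be appended only when the caller has not already supplied one. A minimal sketch under that assumption follows; `with_lib_ext` is a hypothetical helper, not a function from the PR.

```python
import os
import sys


def with_lib_ext(name, platform=sys.platform):
    """Append the shared-library extension unless name already carries one."""
    ext = ".dylib" if platform == "darwin" else ".so"
    _, current = os.path.splitext(name)
    if current in (".so", ".dylib"):
        return name  # the caller already chose an extension; leave it alone
    return name + ext


print(with_lib_ext("libhw", platform="linux"))
print(with_lib_ext("libhw.so", platform="linux"))
print(with_lib_ext("libhw", platform="darwin"))
```

Taking the platform as a parameter keeps the helper testable on any host while still defaulting to sys.platform.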

Member Author

How about this?

@@ -55,5 +57,15 @@ def stats():
x = tvm.get_global_func("vta.simulator.profiler_status")()
return json.loads(x)

def tsim_init(hw_lib):
"""Init hardware library for TSIM"""
Member

For a user-facing function, always document all arguments.

def tsim_init(hw_lib):
    """Description

    Parameters
    ----------
    hw_lib : str
        Path to hardware library
    """

Member Author

Fixed here.

@tqchen tqchen merged commit 32f74f3 into apache:master Jun 5, 2019
@tqchen
Member

tqchen commented Jun 5, 2019

Thanks, @vegaluisjose @tmoreau89 @jroesch @liangfu @huajsj, this PR is now merged.

@vegaluisjose vegaluisjose deleted the nVTA-pr branch June 14, 2019 05:42
wweic pushed a commit to wweic/tvm that referenced this pull request Jun 26, 2019
wweic pushed a commit to neo-ai/tvm that referenced this pull request Jun 27, 2019
tqchen pushed a commit to tqchen/tvm that referenced this pull request Mar 29, 2020

6 participants