
[Snippets] Moved infrastructure to Linear Intermediate Representation #16402

Merged
Commits (28):
1377883
Introduce linear IR and disable obsolete tests
IvanNovoselov Jan 4, 2023
d5f8fb5
[Snippets] Added Loop markup and Loop Fusion on Linear IR Level
a-sidorova Mar 17, 2023
b675bef
Added support of custom Plugin ops in Linear IR
a-sidorova Mar 29, 2023
feb7bfc
[Snippets] Added Buffer identification
a-sidorova Mar 30, 2023
7d4ce5c
[Snippets] Refactoring
a-sidorova Apr 17, 2023
1fada21
Fixes after rebasing
a-sidorova Apr 17, 2023
c530927
Removed work around for StoreEmitter
a-sidorova Apr 17, 2023
467c7aa
[Snippets] Refactoring of transformations
a-sidorova Apr 17, 2023
508a34b
[Snippets] Rebased on the latest master
a-sidorova Apr 19, 2023
7440994
[Snippets] Added support of Port Descriptor (#106)
a-sidorova May 11, 2023
43936a3
Applied comments by Ivan #1
a-sidorova May 11, 2023
e7ee0d5
Fixed Loads with the same Parent: CleanRepeatedPtrShifts
a-sidorova May 11, 2023
5ef7227
Updated Buffer Identification logic
a-sidorova May 11, 2023
2ea1bf3
Cleaned cmake lists
a-sidorova May 11, 2023
9b45cfb
fixes after rebase
a-sidorova May 11, 2023
979b673
fixed lin build
a-sidorova May 11, 2023
ef6717e
fixed build 2
a-sidorova May 11, 2023
dd0a4e1
added missed file
a-sidorova May 11, 2023
f5d59ce
fixed snippets test build
a-sidorova May 12, 2023
1eb736a
Applied comments by Ivan #2
a-sidorova May 12, 2023
89f99e5
[Snippets] Moved reg_info from Expression to PortDescriptor
a-sidorova May 15, 2023
f71b552
Moved Linear IR transformations from generator to Subgraph
a-sidorova May 16, 2023
ec5920b
Fixed InsertStore for Buffer wo inputs
a-sidorova May 17, 2023
14b8709
Removed incorrect extra copy rt_info which break PortDescriptors
a-sidorova May 17, 2023
13d956f
[Snippets] Moved namespace from ngraph to ov
a-sidorova May 18, 2023
d81287e
Applied comments by Dmitry
a-sidorova May 19, 2023
dbfe69a
[Snippets] Tensor -> PortConnector
a-sidorova May 19, 2023
0e04ae1
[Snippets] Added link to doc
a-sidorova May 19, 2023
10 changes: 4 additions & 6 deletions src/common/snippets/include/snippets/emitter.hpp
@@ -24,11 +24,9 @@ class Emitter {
/**
* @brief Default constructor
*/
Emitter(const std::shared_ptr<ngraph::Node>& n) {
}
Emitter(const std::shared_ptr<ngraph::Node>& n) {}

Emitter(std::vector<std::pair<std::shared_ptr<Emitter>, RegInfo>>& region) {
}
Emitter(std::vector<std::pair<std::shared_ptr<Emitter>, RegInfo>>& region) {}

/**
* @brief called by generator to generate code to produce target code for a specific operation
@@ -47,8 +45,8 @@
* @brief called by generator to generate data section, if needed for a specific operation
* @return void
*/
virtual void emit_data() const {
}
virtual void emit_data() const {}

virtual ~Emitter() = default;
};

90 changes: 10 additions & 80 deletions src/common/snippets/include/snippets/generator.hpp
@@ -9,74 +9,13 @@
#pragma once

#include "snippets_isa.hpp"
#include "emitter.hpp"

#include "snippets/lowered/linear_ir.hpp"
#include "snippets/lowered/pass/pass.hpp"

namespace ngraph {
namespace snippets {

auto getRegisters(std::shared_ptr<ngraph::Node>& n) -> ngraph::snippets::RegInfo;

typedef std::pair<std::function<std::shared_ptr<Emitter>(const std::shared_ptr<ngraph::Node>&)>,
std::function<std::set<std::vector<element::Type>>(const std::shared_ptr<ngraph::Node>&)>> jitters_value;
/**
* @interface TargetMachine
* @brief Base class Target machine representation. Target derives from this class to provide generator information about supported emitters
* @ingroup snippets
*/
class TargetMachine {
public:
/**
* @brief checks if target is natively supported
* @return true, if supported
*/
virtual bool is_supported() const = 0;

/**
* @brief finalizes code generation
* @return generated kernel binary
*/
virtual code get_snippet() const = 0;

/**
* @brief gets number of lanes supported by target's vector ISA
* @return number of lanes
*/
virtual size_t get_lanes() const = 0;

/**
* @brief called by generator to all the emitter for a target machine
* @return a map by node's type info with callbacks to create an instance of emitter for corresponding operation type
*/
std::function<std::shared_ptr<Emitter>(std::shared_ptr<ngraph::Node>)> get(const ngraph::DiscreteTypeInfo type) const {
auto jitter = jitters.find(type);
if (jitter == jitters.end()) {
OPENVINO_THROW(std::string("Target code emitter is not available for ") + type.name + " operation.");
}
return jitter->second.first;
}

std::function<std::set<std::vector<element::Type>>(const std::shared_ptr<ngraph::Node>&)>
get_supported_precisions(const ngraph::DiscreteTypeInfo type) const {
auto jitter = jitters.find(type);
if (jitter == jitters.end()) {
OPENVINO_THROW(std::string("Target code emitter is not available for ") + type.name + " operation.");
}
return jitter->second.second;
}

/**
* @brief checks if emitter for a specific operation is supported
* @return true, if supported
*/
bool has(const ngraph::DiscreteTypeInfo type) const {
return jitters.find(type) != jitters.end();
}
virtual ~TargetMachine() = default;

protected:
std::map<const ngraph::DiscreteTypeInfo, jitters_value> jitters;
};

/**
* @interface Schedule
* @brief Return scheduling information and pointer to generated kernel code
@@ -117,7 +56,7 @@ class Generator {
/**
* @brief Default constructor
*/
Generator(const std::shared_ptr<TargetMachine>& t) : target(t) {}
Generator(const std::shared_ptr<TargetMachine>& t) : target(t), lowered_saved{} {}
/**
* @brief Default destructor
*/
@@ -126,27 +65,18 @@
* @interface GeneratorConfig
* @brief Allows to tweak the lowering process.
*/
class GeneratorConfig {
public:
// True if the lowered Emitters need to be accessed during runtime. Normally they're destroyed after code emission.
bool m_save_lowered_code = false;
// True if we can optimize tails for single evaluation during code generation
// More details with optimization examples you can see in generate() method
// For example, tails with Buffer ops doesn't support single evaluation optimizations
// because of that we should always reset memory pointer using finalization offsets
// after data storing to Buffer
bool m_optimize_single_evaluation = true;
// True if we should check runtime info for nodes to call specific needed transformations
bool m_need_fill_tail_register = false;
};
/**
* @brief virtual method any specific implementation should implement
* @param m model in canonical form for table-based code generation
* @param config config with transformation and optimization parameters
* @param compile_params parameters for generated code
* @return pointer to generated code
*/
code generate(std::shared_ptr<ov::Model>& m, const GeneratorConfig& config, const void* compile_params = nullptr);
struct LoweringResult {
LoweringResult(code c) : binary_code(c) {}
code binary_code = nullptr;
};
LoweringResult generate(lowered::LinearIR& linear_ir, const lowered::Config& config, const void* compile_params = nullptr);

/**
* @brief gets target machine
@@ -180,7 +110,7 @@ class Generator {
std::shared_ptr<TargetMachine> target;
// todo: we need to save lowered code to access compiled brgemm kernels on execution time (normally lowered is destructed by then).
// This is temporary solution, remove this when kernel caching is implemented. Don't forget to make generate const method.
std::vector<AllocatedEmitter> lowered_saved;
lowered::LinearIR lowered_saved;
};

} // namespace snippets
99 changes: 99 additions & 0 deletions src/common/snippets/include/snippets/lowered/expression.hpp
Review comments on src/common/snippets/include/snippets/lowered/expression.hpp:

Contributor: Folder structure looks a little unaligned. Maybe we can do something like:

snippets/.../...
snippets/.../linear_ir/...
snippets/.../linear_ir/pass/...
snippets/.../linear_ir/op/...
snippets/.../ngraph/...
snippets/.../ngraph/pass/...
snippets/.../ngraph/op/...

Also not sure about the "lowered" naming. Maybe "linear_ir" fits better, but @IvanNovoselov this is up to you to decide.

Contributor: "lowering" is a rather vague term, so I like the idea of distinguishing the workflow steps based on the IRs they use. In fact, I'm already using this distinction in the documentation.

Contributor: Let's do it in a separate PR then.

@@ -0,0 +1,99 @@
// Copyright (C) 2023 Intel Corporation
// SPDX-License-Identifier: Apache-2.0
//

#pragma once

#include <openvino/core/node.hpp>
#include <openvino/opsets/opset1.hpp>

#include "snippets/emitter.hpp"
#include "snippets/target_machine.hpp"
#include "snippets/lowered/tensor.hpp"
#include "snippets/lowered/expression_port.hpp"


namespace ngraph {
namespace snippets {
namespace lowered {

class LinearIR;

class Expression : public std::enable_shared_from_this<Expression> {
friend class LinearIR;
friend class ExpressionPort;

public:
static size_t LOOP_NULL_ID;

Expression() = default;
virtual ~Expression() = default;

std::shared_ptr<Node> get_node() const;
std::shared_ptr<Emitter> get_emitter() const;

RegInfo get_reg_info() const;
void set_reg_info(RegInfo rinfo);

const TensorPtr& get_input_tensor(size_t i) const;
const TensorPtr& get_output_tensor(size_t i) const;
std::vector<TensorPtr> get_input_tensors() const { return m_input_tensors; }
std::vector<TensorPtr> get_output_tensors() const { return m_output_tensors; }

const PortDescriptorPtr& get_input_port_descriptor(size_t i) const;
const PortDescriptorPtr& get_output_port_descriptor(size_t i) const;
std::vector<PortDescriptorPtr> get_input_port_descriptors() const { return m_input_port_descriptors; }
std::vector<PortDescriptorPtr> get_output_port_descriptors() const { return m_output_port_descriptors; }

size_t get_input_count() const { return m_input_tensors.size(); }
size_t get_output_count() const { return m_output_tensors.size(); }

std::vector<size_t> get_loop_ids() const { return m_loop_ids; }
void set_loop_ids(const std::vector<size_t>& loops) { m_loop_ids = loops; }
void set_loop_id(size_t id, size_t idx);
void remove_loop_id(size_t id);

void validate() const;
void init_emitter(const std::shared_ptr<const TargetMachine>& target);

ExpressionPort get_input_port(size_t i);
ExpressionPort get_output_port(size_t i);

protected:
// Note: The constructor and tensor initialization are private since an expression can be created only by Linear IR.
// These methods must be used only by Linear IR builder of expressions!
explicit Expression(const std::shared_ptr<Node>& n);

void replace_input(size_t port, TensorPtr to);

std::shared_ptr<Node> m_source_node{nullptr};
std::shared_ptr<Emitter> m_emitter{nullptr};
std::vector<TensorPtr> m_input_tensors{};
std::vector<TensorPtr> m_output_tensors{};
std::vector<PortDescriptorPtr> m_input_port_descriptors{};
std::vector<PortDescriptorPtr> m_output_port_descriptors{};
// The order Loops identifies: Outer ---> Inner
std::vector<size_t> m_loop_ids;
};
using ExpressionPtr = std::shared_ptr<Expression>;

class IOExpression : public Expression {
friend class LinearIR;

public:
enum class io_type {INPUT, OUTPUT, UNDEFINED};

int64_t get_index() const { return m_index; }
io_type get_type() const { return m_type; }

private:
explicit IOExpression(const std::shared_ptr<ov::opset1::Parameter>& n, int64_t index);
explicit IOExpression(const std::shared_ptr<ov::opset1::Result>& n, int64_t index);

int64_t m_index = -1;
io_type m_type = io_type::UNDEFINED;
};

} // namespace lowered
} // namespace snippets
} // namespace ngraph
@@ -0,0 +1,55 @@
// Copyright (C) 2023 Intel Corporation
// SPDX-License-Identifier: Apache-2.0
//

#pragma once

#include "linear_ir.hpp"

#include "snippets/snippets_isa.hpp"

namespace ngraph {
namespace snippets {
namespace lowered {

class LinearIR::ExpressionFactory {
public:
template<class... Args>
static ExpressionPtr build(const std::shared_ptr<Node>& n, Args&&... params) {
if (const auto par = ov::as_type_ptr<ov::op::v0::Parameter>(n)) {
return create(par, params...);
} else if (const auto res = ov::as_type_ptr<ov::op::v0::Result>(n)) {
return create(res, params...);
} else if (const auto loop_begin = ov::as_type_ptr<op::LoopBegin>(n)) {
return create(loop_begin, params...);
} else if (const auto loop_end = ov::as_type_ptr<op::LoopEnd>(n)) {
return create(loop_end, params...);
}
return create(n, params...);
}

private:
/* -- Default Builders - initialize input tensors from parents and create new output tensors themselves */
static ExpressionPtr create(const std::shared_ptr<ngraph::op::v0::Parameter>& par, const LinearIR& linear_ir,
const std::shared_ptr<ov::Model>& model);
static ExpressionPtr create(const std::shared_ptr<ngraph::op::v0::Result>& res, const LinearIR& linear_ir,
const std::shared_ptr<ov::Model>& model);
static ExpressionPtr create(const std::shared_ptr<ov::Node>& n, const LinearIR& linear_ir,
const std::shared_ptr<ov::Model>& model);

/* -- Input Builders - get input tensors from method parameters and create new output tensors themselves */
static ExpressionPtr create(const std::shared_ptr<op::LoopBegin>& n, const std::vector<TensorPtr>& inputs);
static ExpressionPtr create(const std::shared_ptr<op::LoopEnd>& n, const std::vector<TensorPtr>& inputs);
static ExpressionPtr create(const std::shared_ptr<ov::Node>& n, const std::vector<TensorPtr>& inputs);

// Creates inputs for expression using parent output tensors
static void create_expression_inputs(const LinearIR& linear_ir, const ExpressionPtr& expr);
// Creates new output tensors
static void create_expression_outputs(const ExpressionPtr& expr);
// Verifies that the expression is registered as a consumer of the input tensors and adds it if missing
static void init_expression_inputs(const ExpressionPtr& expr, const std::vector<TensorPtr>& inputs);
};

} // namespace lowered
} // namespace snippets
} // namespace ngraph
@@ -0,0 +1,51 @@
// Copyright (C) 2023 Intel Corporation
// SPDX-License-Identifier: Apache-2.0
//

#pragma once

#include <memory>
#include <vector>

#include "port_descriptor.hpp"


namespace ngraph {
namespace snippets {
namespace lowered {

class Tensor;
class Expression;
class ExpressionPort {
public:
enum Type {
Input,
Output
};

ExpressionPort() = default;
explicit ExpressionPort(const std::shared_ptr<Expression>& expr, Type type, size_t port);

const std::shared_ptr<Expression>& get_expr() const { return m_expr; }
Type get_type() const { return m_type; }
size_t get_index() const { return m_port_index; }

const PortDescriptorPtr& get_descriptor_ptr() const;
const std::shared_ptr<Tensor>& get_tensor_ptr() const;
// Returns connected ports to the current:
// - Input port returns one source (parent) port
// - Output port returns all consumer ports (children)
std::set<ExpressionPort> get_connected_ports() const;

friend bool operator==(const ExpressionPort& lhs, const ExpressionPort& rhs);
friend bool operator!=(const ExpressionPort& lhs, const ExpressionPort& rhs);
friend bool operator<(const ExpressionPort& lhs, const ExpressionPort& rhs);

private:
std::shared_ptr<Expression> m_expr;
Type m_type = Type::Output;
size_t m_port_index = 0;
};
} // namespace lowered
} // namespace snippets
} // namespace ngraph