implement Arrow2's odbc reader and writers #2994
Comments
Here is a preliminary version I came up with. It only works with string columns (only UTF8 is implemented).

use arrow2::array::Utf8Array;
use arrow2::error::Result;
use arrow2::io::odbc::api::Cursor;
use arrow2::io::odbc::{api, read};
use polars::prelude::*;

const QUERY: &str = include_str!("../query.sql");

fn main() -> Result<()> {
    let connector = "ODBC_STRING";
    let env = api::Environment::new()?;
    let connection = env.connect_with_connection_string(connector)?;
    let mut prep = connection.prepare(QUERY)?;
    let fields = read::infer_schema(&prep)?;
    let mut df = fields
        .iter()
        .map(|s| match s.data_type {
            ArrowDataType::Utf8 => Series::new_empty(&s.name, &DataType::Utf8),
            _ => unimplemented!(),
        })
        .collect::<Vec<_>>();
    let max_batch_size = 100;
    let buffer = read::buffer_from_metadata(&prep, max_batch_size)?;
    let cursor = prep.execute(())?.unwrap();
    let mut cursor = cursor.bind_buffer(buffer)?;
    while let Some(batch) = cursor.fetch()? {
        for ((idx, field), df_elem) in (0..batch.num_cols())
            .zip(fields.iter())
            .zip(df.iter_mut())
        {
            let column_view = batch.column(idx);
            let arr = Arc::from(read::deserialize(column_view, field.data_type.clone()));
            let series = Series::try_from((field.name.as_str(), vec![arr])).unwrap();
            df_elem.append(&series).unwrap();
        }
    }
    let dataframe = DataFrame::new(df).unwrap();
    dbg!(dataframe);
    Ok(())
}

Ideally we should use this function, although it works on all chunks together rather than on individual ones. Edit: |
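The loop in the snippet above appends each fetched batch to a set of per-column accumulators. As a rough illustration of that pattern only, here is a minimal sketch that uses nothing but the standard library. It does not touch ODBC, arrow2, or polars: `Batch` and `accumulate_batches` are hypothetical names invented for this example, and a batch is modelled as column-major vectors of nullable strings rather than real ODBC buffers.

```rust
/// Hypothetical stand-in for one fetched batch: column-major, nullable strings.
/// (Assumption for illustration; real ODBC batches are buffer views.)
type Batch = Vec<Vec<Option<String>>>;

/// Appends every column of every batch to the accumulator for that column.
/// Returns an error if a batch has the wrong number of columns or if its
/// columns have unequal lengths.
fn accumulate_batches<I>(num_cols: usize, batches: I) -> Result<Vec<Vec<Option<String>>>, String>
where
    I: IntoIterator<Item = Batch>,
{
    // One growing accumulator per column, analogous to the `df` vector above.
    let mut columns: Vec<Vec<Option<String>>> = vec![Vec::new(); num_cols];

    for (batch_idx, batch) in batches.into_iter().enumerate() {
        if batch.len() != num_cols {
            return Err(format!(
                "batch {} has {} columns, expected {}",
                batch_idx,
                batch.len(),
                num_cols
            ));
        }
        // All columns within one batch must describe the same number of rows.
        let rows = batch.first().map_or(0, |c| c.len());
        if batch.iter().any(|c| c.len() != rows) {
            return Err(format!("batch {} has columns of unequal length", batch_idx));
        }
        // Append column by column, preserving row order across batches.
        for (acc, col) in columns.iter_mut().zip(batch) {
            acc.extend(col);
        }
    }
    Ok(columns)
}

fn main() {
    let first: Batch = vec![
        vec![Some("a".to_string()), None],
        vec![Some("x".to_string()), Some("y".to_string())],
    ];
    let second: Batch = vec![vec![Some("b".to_string())], vec![None]];

    let columns = accumulate_batches(2, vec![first, second]).expect("batches are well formed");
    for (idx, col) in columns.iter().enumerate() {
        println!("column {}: {:?}", idx, col);
    }
}
```

Validating the column count and row count per batch up front is one way to avoid the `unwrap()` calls on append; whether it is better to append per batch or to collect all chunks and build each column once is the open design question raised above.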
I'm tempted to work on this. I will draft a PR over the weekend. |
I have a working version (it can infer the schema) here |
I see this is still open. Is there interest in this? |
hey, I have a similar requirement. |
Hi, I found the Arrow RecordBatch to DataFrame code here: |
We now have native ODBC support upstream. This has to be exposed in polars similarly to existing IO readers and writers.