Currently, when duplicate column names span multiple dtypes, there are issues in getting the correct block from a column name.
I think it is possible, though non-trivial, to instead keep a positional map from the frame columns to the BlockManager blocks, which would simplify BlockManager.iget.
The primary motivation is that to_csv currently cannot handle these kinds of lookups.
This should also eliminate the need for _find_block.
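To illustrate the positional-map idea, here is a minimal sketch in plain Python. It is not the BlockManager code: `build_positional_map`, `iget`, and the list-of-lists "blocks" are hypothetical stand-ins, and it assumes each block knows which frame column positions it holds.

```python
def build_positional_map(block_col_positions):
    """Map each frame column position to (block index, offset within block).

    block_col_positions[i] lists the frame column positions stored in
    block i (a hypothetical layout, used only for this illustration).
    """
    n_cols = sum(len(positions) for positions in block_col_positions)
    mapping = [None] * n_cols
    for block_idx, positions in enumerate(block_col_positions):
        for offset, col_pos in enumerate(positions):
            if not 0 <= col_pos < n_cols or mapping[col_pos] is not None:
                raise ValueError("invalid or repeated column position: %r" % col_pos)
            mapping[col_pos] = (block_idx, offset)
    return mapping


def iget(blocks, mapping, i):
    """Fetch the i-th frame column by position; labels are never consulted."""
    block_idx, offset = mapping[i]
    return blocks[block_idx][offset]


# Duplicate label 'a' spread across three dtypes.
columns = ["a", "a", "b", "a"]
blocks = [
    [[1.0, 2.0], [5.0, 6.0]],  # float block: frame positions 0 and 2
    [[10, 20]],                # int block: frame position 1
    [["x", "y"]],              # object block: frame position 3
]
mapping = build_positional_map([[0, 2], [1], [3]])
```

Because the lookup goes through positions only, the duplicated label 'a' never has to be resolved to a block.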
In [6]: df = pd.DataFrame(np.random.randn(8,4))
In [12]: df = pd.DataFrame(np.random.randn(8,4))
In [13]: df._data.blocks[0].ref_locs
Out[13]: array([0, 1, 2, 3])
In [14]: df = pd.DataFrame(np.random.randn(8,4),columns=['a']*4)
In [15]: df._data.blocks[0].ref_locs
---------------------------------------------------------------------------
/mnt/home/jreback/pandas/pandas/core/internals.py in ref_locs(self)
52 def ref_locs(self):
53 if self._ref_locs is None:
---> 54 indexer = self.ref_items.get_indexer(self.items)
55 indexer = com._ensure_platform_int(indexer)
56 if (indexer == -1).any():
/mnt/home/jreback/pandas/pandas/core/index.pyc in get_indexer(self, target, method, limit)
835
836 if not self.is_unique:
--> 837 raise Exception('Reindexing only valid with uniquely valued Index '
838 'objects')
839
Exception: Reindexing only valid with uniquely valued Index objects
This is the root of all evil: the following should raise the same exception as above, but it doesn't, even if I consolidate.
In [16]: df = pd.DataFrame(np.random.randn(8,4))
In [17]: df.columns = ['a']*4
In [18]: df._data.blocks[0].ref_locs
Out[18]: array([0, 1, 2, 3])
Is there actually a reason to support duplicate labels at the block layer, rather
than implementing it as a thin mapping layer on top of unique labels?
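As a rough sketch of what such a thin mapping layer could look like (an illustration only, not an existing pandas API): internal storage is keyed by unique positions, and the hypothetical helper `positions_by_label` translates a possibly duplicated label into those positions.

```python
from collections import defaultdict


def positions_by_label(columns):
    """Return {label: [positions]} for a possibly non-unique column list."""
    out = defaultdict(list)
    for pos, label in enumerate(columns):
        out[label].append(pos)
    return dict(out)


# Internal storage would use the unique positions 0..n-1 as keys;
# only this layer knows that the label 'a' appears more than once.
label_map = positions_by_label(["a", "b", "a"])
```

A lookup by label then returns every matching position, and each position is resolved against the unique internal keys.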
See the discussion in #3059 and #3095; also see #1943 and #3102.
This only applies with a non-unique column index