Python Forum
FutureWarning: The behavior of DataFrame concatenation with empty or all-NA entries
#11
I'm not giving up on a vectorized solution quite yet, but a solution that is better than using concat is to use your logic to compute a "late_prints" column. You still pay a looping time penalty, but you avoid the larger penalty of concatenating frames a row at a time.
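
A minimal sketch of that idea, not the original poster's actual logic: the rule used here (a row is late if it differs from the last accepted price by more than pricedelta) and the names prices, last_good and clean are assumptions for illustration. The prices are the sample sequence posted later in this thread. The loop only builds a list of flags; the frame is filtered once at the end with boolean indexing.

```python
import pandas as pd

# Sample prices from later in the thread (rows 6-8 are the late prints).
prices = [180.01, 180.01, 180.02, 180.02, 180.03, 180.01,
          181.32, 181.31, 181.32,
          180.08, 180.08, 180.09, 180.10, 180.19]
df = pd.DataFrame({"price": prices})
pricedelta = 1.00  # assumed threshold

# Loop once to build a plain list of flags instead of growing a frame.
flags = []
last_good = prices[0]  # last price accepted as "normal" (assumed rule)
for price in df["price"]:
    is_late = abs(price - last_good) > pricedelta
    flags.append(is_late)
    if not is_late:
        last_good = price

df["late_prints"] = flags

# One boolean-indexing step replaces all the per-row concat calls.
clean = df[~df["late_prints"]]
```

The loop cost is still there, but no intermediate DataFrames are created, so the row-at-a-time concat penalty goes away.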
#12
Thanks, yeah that's a good point using concat instead of _append. If you can think of a vectorized solution, I'd love that. The fact that I can have an unknown number of rows that are "late prints" is what is really throwing me off from a vectorized solution here...
#13
Quote: that's a good point using concat instead
That's not what I was trying to say. concat is better than append, but both should be used sparingly. What I was trying to say is that I would use boolean indexing to make the new dataframe, and I would use your late prints identifier to create the boolean list. Maybe I could vectorize some of that process.

I would probably start with a shift of price and time. Now I can compute a change rate (price - shifted_price) / (time - shifted_time). If I see a rapid change, I start marking data rows as suspect. I stop suspecting the data when I see a shift in the opposite direction.
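
Roughly what I have in mind for the first step, as a sketch only. The column names time and price, the one-second spacing, and the rate_limit threshold are all assumptions made up for the example.

```python
import pandas as pd

# Hypothetical ticks: one trade per second, with a three-row price spike.
df = pd.DataFrame({
    "time": list(range(14)),  # seconds, assumed evenly spaced
    "price": [180.01, 180.01, 180.02, 180.02, 180.03, 180.01,
              181.32, 181.31, 181.32,
              180.08, 180.08, 180.09, 180.10, 180.19],
})
rate_limit = 1.0  # assumed threshold, in price units per second

# Change rate: (price - shifted_price) / (time - shifted_time).
# The first row has nothing before it, so its rate is NaN.
df["rate"] = (df["price"] - df["price"].shift()) / (df["time"] - df["time"].shift())

# Rows where the price moved too fast. A NaN rate compares as False.
df["rapid"] = df["rate"].abs() > rate_limit
```

With this data the rapid rows are the first late print (a jump up) and the first normal row after the run (a jump back down), which are the two edges I would use to start and stop suspecting the data.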
#14
(Apr-22-2024, 03:00 PM)sawtooth500 Wrote: This provides more info on late prints if you are curious.

https://www.youtube.com/watch?v=OZrMMOHiUeo

Off topic, but a very useful video.
#15
(Apr-22-2024, 09:51 PM)deanhystad Wrote:
Quote: that's a good point using concat instead
That's not what I was trying to say. concat is better than append, but both should be used sparingly. What I was trying to say is that I would use boolean indexing to make the new dataframe, and I would use your late prints identifier to create the boolean list. Maybe I could vectorize some of that process.

I would probably start with a shift of price and time. Now I can compute a change rate (price - shifted_price) / (time - shifted_time). If I see a rapid change, I start marking data rows as suspect. I stop suspecting the data when I see a shift in the opposite direction.

That's more or less the approach I'm taking. Finding the start of late prints is very easy actually - I just shift the dataframe by one row, and if the price change is beyond a certain pricedelta, that's the start. The trick is finding the end of it. Consider the following example price action, and let's set our pricedelta to 1.00:

180.01
180.01
180.02
180.02
180.03
180.01
181.32 LATE PRINT
181.31 LATE PRINT
181.32 LATE PRINT
180.08
180.08
180.09
180.10
180.19

I feel like once I ID the start of a late sequence, I need to use a for loop because I don't know how many there will be before it goes back to "normal". There are 3 late in this case, and I've seen as many as 13 late in a row, but 13 should not be considered an upper limit. If there is no upper limit for the number of bad prints in a row, I don't know how to use a shift to find the end, since that requires knowing by how many rows to shift.
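
One possible way around the unknown run length, offered as a sketch rather than a tested solution: mark each jump beyond pricedelta with its sign (+1 up, -1 down) and take a running sum. The sum stays non-zero from the jump away until the jump back, however many rows that is. This assumes every late run ends with an opposite jump that also exceeds pricedelta, and that runs do not stack; the names step and jump are made up for the example.

```python
import numpy as np
import pandas as pd

prices = [180.01, 180.01, 180.02, 180.02, 180.03, 180.01,
          181.32, 181.31, 181.32,
          180.08, 180.08, 180.09, 180.10, 180.19]
df = pd.DataFrame({"price": prices})
pricedelta = 1.00

# One-row shift only: price change relative to the previous row.
step = df["price"].diff()

# +1 for a jump up beyond pricedelta, -1 for a jump down, 0 otherwise.
# The first row (NaN step) fails the condition and becomes 0.
jump = np.sign(step).where(step.abs() > pricedelta, 0.0)

# Running sum is non-zero between the jump away and the jump back.
df["late_print"] = jump.cumsum() != 0

# Boolean indexing keeps only the normal rows.
clean = df[~df["late_print"]]
```

If the price genuinely moves by more than pricedelta and stays there, this sketch would flag everything after the move, so a real version would need some way to reset, for example a maximum run length or a time limit.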