bug in C++ layer (null pointer)

To
reproduce, compile the Metwear-CppAPI with debug enabled (Compiled
as UWP library and added a reference to it from the C++ lib).

Then use the
CS template to log accel + gyro data (100 Hz). 

We also
have an GPIO signal going in parallel too that we use to stop logging.

 As you can see in the
screenshot, it crashes often in the C++ layer with a null pointer exception at
state->create_next(..)

image


https://ibb.co/eeubrQ

Comments

  • edited May 2017
    Hmm, might be memory corruption in your C++ lib... (weird values in *state).

  • edited May 2017
    Please post the MetaWear code you are running.  Also, when does this error occur and does it consistently happen?
  • It happens very consistently after two/three secs of logging.

    Preudocode:

    click-button-handler() {
           setup-logging-signals
           while (1) {
    start-logging
    delay 2 secs
    stop-logging
    }
    }

    setup_logging_signals() {
    }

    start_logging() {
                    mbl_mw_logging_clear_entries(getBoard());
                    mbl_mw_logging_start(getBoard(), 1);

                    if (useAccel)
                    {
                        mbl_mw_acc_enable_acceleration_sampling(getBoard());
                        mbl_mw_acc_start(getBoard());
                    }

                    if (useGyro)
                    {
                        mbl_mw_gyro_bmi160_enable_rotation_sampling(getBoard());
                        mbl_mw_gyro_bmi160_start(getBoard());
                    }    
    }


  • void stop_logging() {
                    if (useGyro)
                    {
                        mbl_mw_gyro_bmi160_stop(getBoard());
                        mbl_mw_gyro_bmi160_disable_rotation_sampling(getBoard());
                    }

                    if (useAccel)
                    {
                        mbl_mw_acc_stop(getBoard());
                        mbl_mw_acc_disable_acceleration_sampling(getBoard());
                    }

                    mbl_mw_logging_stop(getBoard());
    }
  • with setup_signals:
                mbl_mw_logging_clear_entries(getBoard());
                mbl_mw_metawearboard_set_time_for_response(getBoard(), 500);
                
                IntPtr accSignal = getAccSignal(false);
                IntPtr gyroSignal = getGyroSignal();

                config_acc(accSignal);
                config_gyro(gyroSignal);
                setup_GPIO();

                var logAccFn = new Fn_IntPtr((IntPtr _accLogger) =>
                {
                    accLogger = _accLogger;
                    if (accLogger.ToInt64() != 0)
                    {
                        mbl_mw_logger_subscribe(accLogger, accDataHandler);
                    }
                    else
                    {
                        System.Diagnostics.Debug.WriteLine("ERROR: failed to create the acc-logger");
                    }
                });

                var logGyroFn = new Fn_IntPtr((IntPtr _gyroLogger) =>
                {
                    gyroLogger = _gyroLogger;
                    if (gyroLogger.ToInt64() != 0)
                    {
                        mbl_mw_logger_subscribe(gyroLogger, gyroDataHandler);
                    }
                    else
                    {
                        System.Diagnostics.Debug.WriteLine("ERROR: failed to create the gyro-logger");
                    }
                });

                if (useAccel)
                    mbl_mw_datasignal_log(accSignal, logAccFn);
                if (useGyro)
                    mbl_mw_datasignal_log(gyroSignal, logGyroFn);

  • (shortened as much as possible, removed printlns and such)

  • I cannot reproduce this bug on my side with VS2017.  Have you tried using the DLLs hosted on Nuget?  

    Also, what VS are you using and are you running your UWP app on a desktop or perhaps a Pi?
  • Hi Eric, 

    tried this on a win 10 desktop with latest patches, VS2017, DLL from nuget originally. Compiled the metawear C++ lib locally to get debugging going. Accel, gyro are logging simultaneously, a GPIO subscription set up before logging starts. It looks like a BLE packet arrives, its processed and then added to some list in your C++ layer. There is breaks. Is there perhaps a race condition in adding objects to this list? This makes reproducing it hard unfortunately, its the nature of race conditions...
  • edited May 2017
    Hi Eric, 

    I've a better idea: I created a minimal version of the app that still reproduces it.
    Simply press the start-logging button and see it crash in the C++ layer.

    Where can I upload the source code to/email it to?

    Best regards, 
    Ronald.
  • Put it on GitHub
  • App says i need to copy a 'loops.json' file to the 'LocalState' folder.
  • Hi Eric, 

    missed some code while doing minimization. Removed it now. Can you try again?

    Cheers,
    Ronald.
  • I am able to run your app and create a csv file.  Try decreasing the max connection interval to 7.5ms or increase the time for response to 1000ms.
  • Hi Eric, 

    good to see that you've got it running. I assume that you've clicked start/stop streaming which works ok here as well. 

    Now please press 'start logging' and see the crash. Windows should popup an info box saying that memory is corrupted. When you look closer in visual studio with debug enabled you'll see it happening in your C++ layer.

    Cheers,
    Ronald.
  • edited May 2017
    No, I can run logging just fine on my PC.  As previously mentioned, try decreasing the max connection interval or increasing the time for response.

    You should give MetaBase a try and see if it also crashes when trying to log data:
  • Hi Eric, 

    metabase (installed from the windows store just now) also crashes....
    I had one successful run where it even downloaded something and then it's gone.
    This is with the max. speed metawear gives as our usecase requires it.

    Another I think related symptom is that I can do streaming  (not logging) for a few times (start-stop three or so times)
    and then it won't send me data on the 4th start. For example, for the accelerometer I set a breakpoint in the handler 
    at the 4th or so run and I see it's no longer hit.
    Curiously, compiling it all with debug-enabled seems to make it a little more stable so I suspect a race condition in the C++ layer.

    Cheers,
    Ronald.
  • Hrm, well lets go back to my earlier suggestion from this post so we at leats see what bytes were being sent back and forth lading up to the crash.

  • As requested:
    - WRITE_CHAR: dumps managedArray in writeCharacteristic
    - RESPONSE: dumps the response array in notifyHandler

     use gyro now: True
     use accel now: True
     use spill now: True
    WRITE_CHAR: 11 13 1 
    WRITE_CHAR: 11 13 1 
    The thread 0x20e0 has exited with code 0 (0x0).
    WRITE_CHAR: 11 9 255 255 255 255 
    WRITE_CHAR: 11 9 255 255 255 255 
    high-frequency enabled!
    WRITE_CHAR: 3 3 40 3 
    WRITE_CHAR: 3 3 40 3 
    WRITE_CHAR: 19 3 40 3 
    WRITE_CHAR: 19 3 40 3 
    WRITE_CHAR: 5 10 1 
    WRITE_CHAR: 5 10 1 
    WRITE_CHAR: 5 3 0 
    WRITE_CHAR: 5 3 0 
    WRITE_CHAR: 5 9 0 2 
    WRITE_CHAR: 5 9 0 2 
    WRITE_CHAR: 5 11 0 1 
    WRITE_CHAR: 5 11 0 1 
    gpio-initialized
    GOOD: setup data signals, now registering logger
    WRITE_CHAR: 11 2 3 4 255 96 
    WRITE_CHAR: 11 2 3 4 255 96 
    WRITE_CHAR: 11 2 3 4 255 36 
    WRITE_CHAR: 11 2 3 4 255 36 
    WRITE_CHAR: 11 13 1 
    WRITE_CHAR: 11 13 1 
    RESPONSE: 11 2 6 
    WRITE_CHAR: 11 9 255 255 255 255 
    WRITE_CHAR: 11 9 255 255 255 255 
    RESPONSE: 11 2 7 
    WRITE_CHAR: 11 11 1 
    WRITE_CHAR: 11 11 1 
    GOOD: acc-logger ready
    WRITE_CHAR: 11 1 1 
    WRITE_CHAR: 11 1 1 
    WRITE_CHAR: 11 2 19 5 255 96 
    WRITE_CHAR: 11 2 19 5 255 96 
    WRITE_CHAR: 11 2 19 5 255 36 
    WRITE_CHAR: 11 2 19 5 255 36 
    WRITE_CHAR: 3 2 1 0 
    WRITE_CHAR: 3 2 1 0 
    WRITE_CHAR: 3 1 1 
    WRITE_CHAR: 3 1 1 
    WRITE_CHAR: 19 2 1 0 
    WRITE_CHAR: 19 2 1 0 
    WRITE_CHAR: 19 1 1 
    WRITE_CHAR: 19 1 1 
    GOOD:----------------- logging for 2000 ms
    The thread 0x2078 has exited with code 0 (0x0).
  • Some  more info: it crashes at:  state->create_next(true) at that point in mbl_mw_datasignal_log.

    Callstack:
    -----------------
    > MetaWear.Win32.dll!mbl_mw_datasignal_log::__l2::void <lambda>(void)::__l2::<lambda>() Line 478 C++
      MetaWear.Win32.dll!std::_Invoker_functor::_Call<void <lambda>(void) & __ptr64>(mbl_mw_datasignal_log::__l2::void <lambda>(void)::__l2::void <lambda>(void) & _Obj) Line 1534 C++
      MetaWear.Win32.dll!std::invoke<void <lambda>(void) & __ptr64>(mbl_mw_datasignal_log::__l2::void <lambda>(void)::__l2::void <lambda>(void) & _Obj) Line 1534 C++
      MetaWear.Win32.dll!std::_Invoker_ret<void,1>::_Call<void <lambda>(void) & __ptr64>(mbl_mw_datasignal_log::__l2::void <lambda>(void)::__l2::void <lambda>(void) & <_Vals_0>) Line 1560 C++
      MetaWear.Win32.dll!std::_Func_impl<void <lambda>(void),std::allocator<int>,void>::_Do_call() Line 212 C++
      MetaWear.Win32.dll!std::_Func_class<void>::operator()() Line 278 C++
      MetaWear.Win32.dll!ThreadPool::schedule::__l2::<lambda>() Line 38 C++
      MetaWear.Win32.dll!std::_Invoker_functor::_Call<void <lambda>(void) >(ThreadPool::schedule::__l2::void <lambda>(void) && _Obj) Line 1534 C++
      MetaWear.Win32.dll!std::invoke<void <lambda>(void) >(ThreadPool::schedule::__l2::void <lambda>(void) && _Obj) Line 1534 C++
      MetaWear.Win32.dll!std::_LaunchPad<std::unique_ptr<std::tuple<void <lambda>(void) >,std::default_delete<std::tuple<void <lambda>(void) > > > >::_Execute<0>(std::tuple<void <lambda>(void) > & _Tup, std::integer_sequence<unsigned __int64,0> __formal) Line 241 C++
      MetaWear.Win32.dll!std::_LaunchPad<std::unique_ptr<std::tuple<void <lambda>(void) >,std::default_delete<std::tuple<void <lambda>(void) > > > >::_Run(std::_LaunchPad<std::unique_ptr<std::tuple<void <lambda>(void) >,std::default_delete<std::tuple<void <lambda>(void) > > > > * _Ln) Line 247 C++
      MetaWear.Win32.dll!std::_LaunchPad<std::unique_ptr<std::tuple<void <lambda>(void) >,std::default_delete<std::tuple<void <lambda>(void) > > > >::_Go() Line 233 C++
      MetaWear.Win32.dll!std::_Pad::_Call_func(void * _Data) Line 211 C++
      [External Code]


    value of state:
    ------------------

    - state shared_ptr {log_time_references={ size=1 } latest_tick={ size=0 } data_loggers={ size=2 } ...} [3 strong refs] [make_shared] std::shared_ptr<LoggerState>
    - [ptr] 0x0000022e022df000 {log_time_references={ size=1 } latest_tick={ size=0 } data_loggers={ size=2 } ...} LoggerState *
    + AsyncCreator {pending_fns={queue_mutex={...} elements={ size=1 } } timeout=0x0000022e0275a570 {cancelled=false } } AsyncCreator
    + log_time_references { size=1 } std::unordered_map<unsigned char,TimeReference,std::hash<unsigned char>,std::equal_to<unsigned char>,std::allocator<std::pair<unsigned char const ,TimeReference> > >
    + latest_tick { size=0 } std::unordered_map<unsigned char,unsigned int,std::hash<unsigned char>,std::equal_to<unsigned char>,std::allocator<std::pair<unsigned char const ,unsigned int> > >
    + data_loggers { size=2 } std::unordered_map<unsigned char,MblMwDataLogger *,std::hash<unsigned char>,std::equal_to<unsigned char>,std::allocator<std::pair<unsigned char const ,MblMwDataLogger *> > >
    + next_logger 0x0000000000000000 <NULL> MblMwDataLogger *
    + log_download_handler {received_progress_update=0xcdcdcdcdcdcdcdcd received_unknown_entry=0xcdcdcdcdcdcdcdcd received_unhandled_entry=...} MblMwLogDownloadHandler
    log_download_notify_progress -431602080. float
    n_log_entries 3452816845 unsigned int

    Notice how the value of n_lo_entries = 0xcdcdcdcd ??

  • edited May 2017
    n_log_entries isn't used at that point and isn't set until you start a log download so its value is irrelevant.  As you can see, it is the same values as the two function pointers defined above it, which are also not set until a log download.

    Looking at the bytes exchanged between your Win10 device and board, your run should have printed "ERROR: failed to create the gyro-logger" (DeviceSetup.xaml.cs:L624) but it doesn't because setup_logging_signals does not wait for both mbl_mw_datasignal_log calls to finish.  This is probably why your app is crashing as setup_logging_signals exits before the log calls finish and both the logGyroFn and logAccFn callback functions are garbage collected resulting in bogus memory addresses.

    For this specific run, your board no longer has available loggers to allocate.  If this happens, you can either reset the board or call mbl_mw_metawearboard_tear_down.


This discussion has been closed.