
Learn how to implement loss functions in TensorFlow in this article by Nick McClure, a senior data scientist at PayScale with a passion for learning and advocating for analytics, machine learning, and artificial intelligence.

Loss functions are very important for machine learning algorithms. They measure the distance between the model outputs and the target (truth) values. This article delves into various loss function implementations in TensorFlow.

Getting ready

In order to optimize your machine learning algorithms, you need to evaluate the outcomes. Evaluating outcomes in TensorFlow depends on specifying a loss function. A loss function tells TensorFlow how good or bad the predictions are, compared with the desired result. In most cases, you’ll have a set of data and a target on which to train your algorithm. The loss function compares the target with the prediction and gives a numerical distance between the two.

This article will cover the main loss functions that you can implement in TensorFlow. To see how the different loss functions operate, start a computational graph and load matplotlib, a Python plotting library, using the following code:

import matplotlib.pyplot as plt
import tensorflow as tf
sess = tf.Session()  # start the computational graph session

How to do it…

  1. First, look at loss functions for regression, which means predicting a continuous dependent variable. Create a sequence of your predictions and a target as a tensor using the code below. You will plot the results across 500 x-values between -1 and 1 later in the article.
x_vals = tf.linspace(-1., 1., 500)
target = tf.constant(0.)
  2. The L2 norm loss is also known as the Euclidean loss function. It is just the square of the distance to the target. Here, you'll compute the loss function as if the target is zero. The L2 norm is a great loss function because it is curved near the target, which algorithms can exploit to converge more slowly (take smaller steps) the closer they get to it. You can implement this as follows:
l2_y_vals = tf.square(target - x_vals)
l2_y_out = sess.run(l2_y_vals)
  3. The L1 norm loss is also known as the absolute loss function. Instead of squaring the difference, you take the absolute value. The L1 norm is better for outliers than the L2 norm because it is not as steep for larger values. One issue to be aware of is that the L1 norm is not smooth at the target, and this can result in algorithms not converging well. Implement it as follows:
l1_y_vals = tf.abs(target - x_vals)
l1_y_out = sess.run(l1_y_vals)
  4. Pseudo-Huber loss is a continuous and smooth approximation to the Huber loss function. This loss function attempts to take the best of the L1 and L2 norms by being convex near the target and less steep for extreme values. The form depends on an extra parameter, delta, which dictates how steep it will be. Plot two forms, delta1 = 0.25 and delta2 = 5, to show the difference, as follows:
delta1 = tf.constant(0.25)
phuber1_y_vals = tf.multiply(tf.square(delta1), tf.sqrt(1. + tf.square((target - x_vals)/delta1)) - 1.)
phuber1_y_out = sess.run(phuber1_y_vals)
delta2 = tf.constant(5.)
phuber2_y_vals = tf.multiply(tf.square(delta2), tf.sqrt(1. + tf.square((target - x_vals)/delta2)) - 1.)
phuber2_y_out = sess.run(phuber2_y_vals)
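Assuming NumPy is available, the three regression losses above can be sanity-checked outside TensorFlow with a short sketch (the variable names mirror the snippets above):

```python
import numpy as np

x_vals = np.linspace(-1., 1., 500)
target = 0.

# L2: squared distance -- steep far from the target, flat near it.
l2 = np.square(target - x_vals)
# L1: absolute distance -- constant slope, with a kink at the target.
l1 = np.abs(target - x_vals)
# Pseudo-Huber: behaves like L2 near the target and like L1 far away.
delta = 0.25
phuber = delta**2 * (np.sqrt(1. + np.square((target - x_vals) / delta)) - 1.)

print(l2[0], l1[0])  # 1.0 1.0 at x = -1
```

For large errors, L1 grows linearly while L2 grows quadratically, which is why L1 is the more outlier-robust of the two.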

Now, move on to loss functions for classification problems. Classification loss functions are used to evaluate loss when predicting categorical outcomes. Usually, the output of your model for a class category is a real number between 0 and 1. You choose a cutoff (0.5 is commonly chosen) and classify the outcome as being in that category if the number is above the cutoff. Here, consider the various loss functions for categorical outputs:

  1. You’ll need to redefine your predictions (x_vals) and target. Save the outputs and plot them in the next section:
x_vals = tf.linspace(-3., 5., 500)
target = tf.constant(1.)
targets = tf.fill([500,], 1.)
  2. Hinge loss is mostly used for support vector machines but can be used in neural networks as well. It is meant to compute a loss between two target classes, 1 and -1. In the following code, you'll use the target value 1, so the closer your predictions are to 1, the lower the loss value:
hinge_y_vals = tf.maximum(0., 1. - tf.multiply(target, x_vals))
hinge_y_out = sess.run(hinge_y_vals)
  3. Cross-entropy loss for a binary case is also sometimes referred to as the logistic loss function. It comes about when you are predicting the two classes 0 or 1. You may wish to measure the distance from the actual class (0 or 1) to the predicted value, which is usually a real number between 0 and 1. To measure this distance, use the cross-entropy formula from information theory, as follows:
xentropy_y_vals = - tf.multiply(target, tf.log(x_vals)) - tf.multiply((1. - target), tf.log(1. - x_vals))
xentropy_y_out = sess.run(xentropy_y_vals)
  4. Sigmoid cross-entropy loss is similar to the previous loss function, except you transform the x values with the sigmoid function before putting them into the cross-entropy loss, as follows:
xentropy_sigmoid_y_vals = tf.nn.sigmoid_cross_entropy_with_logits(logits=x_vals, labels=targets)
xentropy_sigmoid_y_out = sess.run(xentropy_sigmoid_y_vals)
  5. Weighted cross-entropy loss is a weighted version of the sigmoid cross-entropy loss. Here, you provide a weight on the positive target. For example, weight the positive target by 0.5, as follows:
weight = tf.constant(0.5)
xentropy_weighted_y_vals = tf.nn.weighted_cross_entropy_with_logits(logits=x_vals, targets=targets, pos_weight=weight)
xentropy_weighted_y_out = sess.run(xentropy_weighted_y_vals)
  6. Softmax cross-entropy loss operates on non-normalized outputs. This function is used to measure loss when there is only one target category instead of multiple. Because of this, the function transforms the outputs into a probability distribution via the softmax function and then computes the loss against a true probability distribution, as follows:
unscaled_logits = tf.constant([[1., -3., 10.]])
target_dist = tf.constant([[0.1, 0.02, 0.88]])
softmax_xentropy = tf.nn.softmax_cross_entropy_with_logits_v2(logits=unscaled_logits, labels=target_dist)
print(sess.run(softmax_xentropy))
[ 1.16012561]
  7. Sparse softmax cross-entropy loss is the same as the previous one, except instead of the target being a probability distribution, it is the index of the true category. Instead of a sparse all-zero target vector with one value of 1, pass in the index of the true category, as follows:
unscaled_logits = tf.constant([[1., -3., 10.]])
sparse_target_dist = tf.constant([2])
sparse_xentropy = tf.nn.sparse_softmax_cross_entropy_with_logits(logits=unscaled_logits, labels=sparse_target_dist)
print(sess.run(sparse_xentropy))
[ 0.00012564]
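The two printed values can be double-checked with a small NumPy sketch (NumPy assumed available): the softmax cross-entropy is the negative sum of target probabilities times log-softmax probabilities, and the sparse variant is just the negative log-probability at the true index.

```python
import numpy as np

logits = np.array([1., -3., 10.])
target_dist = np.array([0.1, 0.02, 0.88])

# Softmax with the usual max-subtraction trick for numerical stability.
exp = np.exp(logits - logits.max())
probs = exp / exp.sum()

softmax_xent = -(target_dist * np.log(probs)).sum()
sparse_xent = -np.log(probs[2])  # true class index is 2

print(round(softmax_xent, 5))  # 1.16013, matching the TensorFlow output
print(round(sparse_xent, 5))   # 0.00013
```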

How it works…

This code shows how to use matplotlib to plot the regression loss functions:

x_array = sess.run(x_vals)
plt.plot(x_array, l2_y_out, 'b-', label='L2 Loss')
plt.plot(x_array, l1_y_out, 'r--', label='L1 Loss')
plt.plot(x_array, phuber1_y_out, 'k-.', label='P-Huber Loss (0.25)')
plt.plot(x_array, phuber2_y_out, 'g:', label='P-Huber Loss (5.0)')
plt.ylim(-0.2, 0.4)
plt.legend(loc='lower right', prop={'size': 11})
plt.show()

You’ll get the following plot as output from the preceding code:


Figure 4: Plotting various regression loss functions

And here is how to use matplotlib to plot the various classification loss functions:

x_array = sess.run(x_vals)
plt.plot(x_array, hinge_y_out, 'b-', label='Hinge Loss')
plt.plot(x_array, xentropy_y_out, 'r--', label='Cross Entropy Loss')
plt.plot(x_array, xentropy_sigmoid_y_out, 'k-.', label='Cross Entropy Sigmoid Loss')
plt.plot(x_array, xentropy_weighted_y_out, 'g:', label='Weighted Cross Entropy Loss (x0.5)')
plt.ylim(-1.5, 3)
plt.legend(loc='lower right', prop={'size': 11})
plt.show()

You’ll get the following plot from the preceding code:


Figure 5: Plots of classification loss functions

There’s more…

Here is a table summarizing the different loss functions covered in this article:

Loss function | Use | Benefits | Disadvantages
L2 | Regression | More stable | Less robust
L1 | Regression | More robust | Less stable
Pseudo-Huber | Regression | More robust and stable | One more parameter
Hinge | Classification | Creates a max margin for use in SVM | Unbounded loss affected by outliers
Cross-entropy | Classification | More stable | Unbounded loss, less robust

The remaining classification loss functions all have to do with the type of cross-entropy loss. The cross-entropy sigmoid loss function is for use on unscaled logits and is preferred over computing the sigmoid and then the cross-entropy. This is because TensorFlow has better built-in ways to handle numerical edge cases. The same goes for softmax cross-entropy and sparse softmax cross-entropy.

Most of the classification loss functions described here are for two-class predictions. This can be extended to multiple classes by summing the cross-entropy terms over each prediction/target.

There are several other metrics to look at when evaluating a model. Here’s a list of some more to consider:

Model metric | Description
R-squared (coefficient of determination) | For linear models, this is the proportion of variance in the dependent variable that is explained by the independent data. For models with a larger number of features, consider using adjusted R-squared.
Root mean squared error | For continuous models, this measures the difference between prediction and actual via the square root of the average squared error.
Confusion matrix | For categorical models, look at a matrix of predicted categories versus actual categories. A perfect model has all the counts along the diagonal.
Recall | For categorical models, this is the fraction of true positives over all actual positives.
Precision | For categorical models, this is the fraction of true positives over all predicted positives.
F-score | For categorical models, this is the harmonic mean of precision and recall.
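As a minimal illustration of the last three metrics in the table (the labels below are made up for the example, and NumPy is assumed available):

```python
import numpy as np

y_true = np.array([1, 0, 1, 1, 0, 1])  # actual classes
y_pred = np.array([1, 0, 0, 1, 1, 1])  # predicted classes

tp = np.sum((y_pred == 1) & (y_true == 1))  # true positives: 3
fp = np.sum((y_pred == 1) & (y_true == 0))  # false positives: 1
fn = np.sum((y_pred == 0) & (y_true == 1))  # false negatives: 1

precision = tp / (tp + fp)  # fraction of predicted positives that are real
recall = tp / (tp + fn)     # fraction of actual positives that were found
f_score = 2 * precision * recall / (precision + recall)
print(precision, recall, f_score)  # 0.75 0.75 0.75
```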

If you found this article interesting, you can explore Nick McClure’s TensorFlow Machine Learning Cookbook – Second Edition to skip the theory and get the most out of TensorFlow to build production-ready machine learning models. TensorFlow Machine Learning Cookbook – Second Edition will teach you how to use TensorFlow for complex data computations and allow you to dig deeper and gain more insights into your data than ever before.

Android applications are bundled and distributed as APKs (Android application packages). To make an APK file, a program for Android is first compiled, and then all of its parts are packaged into one file. An APK file contains all of that program’s code (such as .dex files), resources, assets, certificates, and the manifest file. As is the case with many file formats, APK files can have any name needed, provided that the file name ends in “.apk”.

Apk Contents:

An APK file is an archive that usually contains the following files and directories:

  • META-INF directory:
    • MANIFEST.MF: the Manifest file
    • CERT.RSA: The certificate of the application.
    • CERT.SF: The list of resources and SHA-1 digest of the corresponding lines in the MANIFEST.MF file; for example:
 Signature-Version: 1.0
 Created-By: 1.0 (Android)
 SHA1-Digest-Manifest: wxqnEAI0UA5nO5QJ8CGMwjkGGWE=
 Name: res/layout/exchange_component_back_bottom.xml
 SHA1-Digest: eACjMjESj7Zkf0cBFTZ0nqWrt7w=
 Name: res/drawable-hdpi/icon.png
 SHA1-Digest: DGEqylP8W0n0iV/ZzBx3MW0WGCA=
  • lib: the directory containing the compiled code that is specific to a software layer of a processor; this directory is split into more directories within it:
    • armeabi: compiled code for all ARM based processors only
    • armeabi-v7a: compiled code for all ARMv7 and above based processors only
    • arm64-v8a: compiled code for all ARMv8 arm64 and above based processors only
    • x86: compiled code for x86 processors only
    • x86_64: compiled code for x86 64 processors only
    • mips: compiled code for MIPS processors only
  • res: the directory containing resources not compiled into resources.arsc (see below).
  • assets: a directory containing application assets, which can be retrieved by AssetManager.
  • AndroidManifest.xml: An additional Android manifest file, describing the name, version, access rights, referenced library files for the application. This file may be in Android binary XML that can be converted into human-readable plaintext XML with tools such as AXMLPrinter2, android-apktool, or Androguard.
  • classes.dex: The classes compiled in the dex file format understandable by the Dalvik virtual machine
  • resources.arsc: a file containing precompiled resources, such as binary XML for example.
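Since an APK is an ordinary ZIP archive, its contents can be listed with nothing more than Python's standard zipfile module. The snippet below builds a tiny stand-in "APK" in memory purely for illustration (the entry names are illustrative, not from a real app):

```python
import io
import zipfile

# Build a minimal stand-in "APK" in memory.
buf = io.BytesIO()
with zipfile.ZipFile(buf, "w") as z:
    z.writestr("AndroidManifest.xml", "<manifest/>")
    z.writestr("classes.dex", b"dex\n035\x00")
    z.writestr("resources.arsc", b"\x02\x00")
    z.writestr("META-INF/MANIFEST.MF", "Manifest-Version: 1.0\n")

# Listing a real APK works the same way: zipfile.ZipFile("app.apk").namelist()
with zipfile.ZipFile(buf) as apk:
    names = apk.namelist()
print(names)
```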

Decompilation process:

Our prerequisites are these 3 tools:

  • dex2jar: Used to convert the apk to jar file. Can be downloaded from here.

  • JD-GUI: Used to view the contents/source from the jar file decompiled in previous step. Details are here.

  • apktool: For reverse engineering the apk to extract files and folders. This can be used to extract the manifest individually and then reading from it. It is available here for download.

dex2jar and JD-GUI are used together. dex2jar converts apk to jar file and JD-GUI provides the editor to browse that jar file. To use dex2jar:

  1. Download dex2jar from here and extract it to a separate folder.
  2.  Execute the following command to decompile an apk:

    sh d2j-dex2jar.sh testapp.apk

  3. The terminal might show you a permissions error for the dex2jar script while executing step 2; if that happens, grant the appropriate permissions by executing:

    sudo chmod +x d2j-dex2jar.sh

  4. After the above steps, testapp.jar should be generated, which can be opened and browsed via JD-GUI. This file contains all the decompiled code (.class files).


We are already able to browse the source code using dex2jar and JD-GUI; however, another important tool in the arsenal is apktool, a tool for reverse engineering 3rd-party, closed, binary Android apps. It can decode resources to nearly original form and rebuild them after making some modifications. It also makes working with an app easier because of its project-like file structure and automation of some repetitive tasks like building the apk. Decoding an apk via apktool can be done using the following command:

apktool d test.apk

Repackaging as an apk can be done by:

apktool b test

JD-GUI, dex2jar and apktool are the essential tooling required to get an effective and deep insight into 3rd-party apps, which often exist as black boxes. As shown above, the usage is simple and pretty straightforward. I would request you to share your inputs and experiences in the comments with these tools or any others that you might have explored for decoding Android or any other platform apps.


Ensuring Quality in iOS Apps

Code coverage is an important metric: a measure used to describe the degree to which the source code of a program is executed when a particular test suite runs. A program with high code coverage, measured as a percentage, has had more of its source code executed during testing, which suggests it has a lower chance of containing undetected software bugs compared to a program with low code coverage. Many different metrics can be used to calculate code coverage; some of the most basic are the percent of program subroutines and the percent of program statements called during execution of the test suite.

Xcode can generate code coverage artefacts for any project, which it does by using llvm-cov internally. Although the code coverage data is available in Xcode itself, it is often a requirement to export this data as input to a third-party tool so that the metrics can be made available to all stakeholders (e.g. business). We will use this blog post to lay out the mechanism to export this code coverage profiling data as a text file, so that other tools (third party, e.g. TICS) can use it as input.


  • Make sure that code coverage generation is enabled in Xcode and that the metrics are visible in Xcode itself. Much has already been said about this, and some excellent documentation is available here.


  • Once you have made sure that the coverage metrics are enabled, run the tests.
  • After the test run, the console should tell you the location (path) where the profiling data was generated.
  • Translate Coverage.profdata (this should be available at the path identified in the step above). The translation is done via this command:

    /usr/bin/xcrun llvm-cov show -instr-profile /Users/path/to/DerivedData/Build/Intermediates/CodeCoverage/Coverage.profdata  /Users/path/to/DerivedData/Build/Intermediates/CodeCoverage/Products/Debug-iphonesimulator/ > /Users/path/where/you/want/imported/data/llvm-cov-show.txt

  • By now, at the path provided in the command above, a text file with llvm-cov coverage data (llvm-cov-show.txt) will have been generated. This file is the end product and can be used as input to any of the third-party tools.

Android applications are bundled and installed as an Android application package (apk) on an Android device. More often than not, the stakeholders of an application require confidence that the application being developed is secure and does not pose an unacceptable level of risk. Drozer is just the tool for that.

Warning: This is going to be a comprehensive, long and fairly technical post. Once again, nothing is left to reader’s/user’s imagination.

Drozer is a security audit and attack framework for Android which works by allowing you to interact with the Dalvik VM, other apps’ IPC endpoints and the underlying OS. Drozer provides tools to help you use and share public exploits for Android. For remote exploits, it can generate shellcode to help you to deploy the drozer Agent as a remote administrator tool, with maximum leverage on the device.

Now let us go ahead and configure drozer. We will be using a Mac OS X based system; however, the instructions are more or less aligned for a Windows-based system too. Some prerequisites for configuring drozer are:

  1. Java Development Kit (JDK) 1.6 – (very important! See configuration of Java 6 below for reason)
  2. Python 2.7.
  3. Android SDK
  4. You should ensure that each of these tools is on your path: adb, java

Let us move from the easiest to the most difficult prerequisite configuration:

Configuring adb:

On Mac this can be done via homebrew by executing the following at the terminal:

Configuring Java 6:

It is very important that Java 1.6 is installed and used. This is because Android bytecode is only compliant with version 1.6 and not higher versions. Using any version of javac other than 1.6 will result in compilation errors that look similar to the following:

bad class file magic (cafebabe) or version (0033.0000)

You can download legacy Java6 for OSX from here:

Once you have downloaded and installed it, it is necessary to make sure that javac -version returns 1.6.* on the terminal. See this link for information on how JAVA_HOME is set up on Mac OS X. If Java 6 was correctly installed at /Library/Java/, then it can be made primary by using the following command at the terminal:

export JAVA_HOME=/Library/Java/JavaVirtualMachines/1.6.0.jdk/Contents/Home/

All the important configurations are done. Python is installed by default on Mac OSX and is accessible via the terminal.

Installing Drozer:

Once the configurations are done, we can now clone drozer directly from git itself. All commands will be run on the terminal.

  1. At an appropriate location (a separate folder, preferably), clone the drozer repo:

 git clone

  2. We will set up drozer from the repo directly. Once the repo is cloned, go to the root of the repo: cd drozer
  3. Set up the PYTHONPATH to include/find drozer:

    export PYTHONPATH=/path/to/drozer/src/:$PYTHONPATH

  4. Now you can run:

    python build

  5. After the build finishes, if you type drozer in the terminal, the command should execute successfully.

Hacking with Drozer:

Before you start hacking, we will need an Android device (although the steps are valid for an emulator too). Also, we will need to install the drozer agent and sieve apk(s) on the device. The purpose of these: sieve is a vulnerable application that we will be testing; the agent app is our gateway into the Dalvik VM of the Android device we are connected to. The agent opens a ServerSocket, on port 31415 by default, and awaits incoming connections and commands via those connections.

Both the apps are available here:

To connect to drozer agent app:

  1. Connect the android device to the computer via usb.
  2. Launch the agent app on the android device.
  3. Push the “on” button on the app screen. This will make it listen for incoming tcp connections on port 31415
  4. On the terminal type: adb forward tcp:31415 tcp:31415
  5. And after that:

    drozer console connect

    This should launch drozer console:


As soon as this happens, we can safely say that we are connected to the Android device. The screen of the drozer app should look like the one below, and the connectivity indicator for the embedded server should have turned green:


The drozer Console is a command line environment, which should be familiar to anybody who has used a bash shell or Windows terminal. Now we can go ahead and assess sieve.

The first step in assessing Sieve is to find it on the Android device. Apps installed on an Android device are uniquely identified by their ‘package name’. We can use the `app.package.list` command to find the identifier for Sieve:

dz> run app.package.list -f sieve


We can ask drozer to provide some basic information about the package using the `app.package.info` command:

dz> run app.package.info -a com.mwr.example.sieve

Package: com.mwr.example.sieve

Process Name: com.mwr.example.sieve

Version: 1.0

Data Directory: /data/data/com.mwr.example.sieve

APK Path: /data/app/com.mwr.example.sieve-2.apk

UID: 10056

GID: [1028, 1015, 3003]

Shared Libraries: null

Shared User ID: null

Uses Permissions:

– android.permission.READ_EXTERNAL_STORAGE

– android.permission.WRITE_EXTERNAL_STORAGE

– android.permission.INTERNET

Defines Permissions:

– com.mwr.example.sieve.READ_KEYS

– com.mwr.example.sieve.WRITE_KEYS

This shows us a number of details about the app, including the version, where the app keeps its data on the device, where it is installed and a number of details about the permissions allowed to the app.

We can ask drozer to report on Sieve’s attack surface:

dz> run app.package.attacksurface com.mwr.example.sieve

Attack Surface:

  3 activities exported

  0 broadcast receivers exported

  2 content providers exported

  2 services exported

    is debuggable

We can drill deeper into this attack surface by using some more specific commands. For instance, we can ask which activities are exported by Sieve:

dz> run app.activity.info -a com.mwr.example.sieve

Package: com.mwr.example.sieve




The PWList activity is exported and does not require any permission, so we can ask drozer to launch it:

dz> run app.activity.start --component com.mwr.example.sieve com.mwr.example.sieve.PWList

This formulates an appropriate Intent in the background, and delivers it to the system through the `startActivity` call. Sure enough, we have successfully bypassed the authorization and are presented with a list of the user’s credentials:


Next we can gather some more information about the content providers exported by the app. Once again we have a simple command available to request additional information:

dz> run app.provider.info -a com.mwr.example.sieve

Package: com.mwr.example.sieve

  Authority: com.mwr.example.sieve.DBContentProvider

    Read Permission: null

    Write Permission: null

    Content Provider: com.mwr.example.sieve.DBContentProvider

    Multiprocess Allowed: True

    Grant Uri Permissions: False

    Path Permissions:

     Path: /Keys


       Read Permission: com.mwr.example.sieve.READ_KEYS

       Write Permission: com.mwr.example.sieve.WRITE_KEYS

  Authority: com.mwr.example.sieve.FileBackupProvider

    Read Permission: null

    Write Permission: null

    Content Provider: com.mwr.example.sieve.FileBackupProvider

    Multiprocess Allowed: True

    Grant Uri Permissions: False


This shows the two exported content providers that the attack surface alluded to earlier. It confirms that these content providers do not require any particular permission to interact with them, except for the /Keys path in the DBContentProvider.

Also, We identified that Sieve exported two services. As with activities and content providers, we can ask for a little more detail:

dz> run app.service.info -a com.mwr.example.sieve

Package: com.mwr.example.sieve


    Permission: null


    Permission: null

Once again, these services are exported to all other apps, with no permission required to access them.

And that is it. We configured, installed, and scratched the surface of an app using drozer. There is much more to drozer, which we will cover in subsequent posts. I hope this was useful and informative.

Part 1 of this post outlined the setup of sqlmap and finding the databases on a backend. We were able to discover that there were 2 databases on the backend, namely information_schema and webscantest, as shown below,


Let’s try to find out what tables are present in the webscantest database. We will execute the following command to do so,


Once again, after spewing a lot of output, it presents us with the list of the following tables,


Now, just stop here for a second and reflect on the fact that we started with a web app with SQL storage and now we are seeing how the data in the app is structurally stored. Our next step should obviously be to find the data in the accounts table, which, guessing by its name, should be storing user account details, via the following command,


And voilà, we have the keys to the castle :).


We will not cover breaking the password hash in this writeup, but I’ll add a hint: Sqlmap can help you with that too.

This covers how sqlmap actually works and what is needed to break into vulnerable SQL storage. Of course, this was done on a test web app, but all the techniques still hold good for all the vulnerable SQL implementations out there.

A word of caution: Sqlmap is a powerful tool and with great power comes great responsibility. Sqlmap should be used with caution and for responsible disclosure only.

Sqlmap is a framework designed to expose vulnerabilities in SQL-based storage, a common thing in various web apps. The wiki and the documentation still leave a learning curve for the newbie, so we decided that a writeup could be really helpful. This will be a long post that leaves nothing to imagination, hence it will be in 2 parts.

Numerous web-based applications use SQL-based storage (e.g. MySQL) for their backend. These applications usually use this storage/database as part of the LAMP stack, an acronym for the open source platform on which these web applications are built: L(Linux), A(Apache), M(MySQL), P(PHP/Perl/Python).

Sqlmap is designed to find flaws and expose vulnerabilities in web applications which interact with and use database servers based on SQL (MySQL, Oracle, PostgreSQL, Microsoft SQL Server, Microsoft Access, IBM DB2, SQLite, Firebird, Sybase, SAP MaxDB, HSQLDB and Informix). A quick and dirty way to check whether a web application uses a SQL implementation (e.g. MySQL) is to pay close attention to the URL itself. If the URL looks like/matches the ones below:

then there is a very good chance that we can use sqlmap here.

For those familiar with google hacking/dorks, Vulnerable web apps can easily be found by running the following google dorks queries:

  • inurl:index.php?id=
  • inurl:gallery.php?id=
  • inurl:post.php?id=
  • inurl:article?id=

These queries can literally end up in millions of vulnerable findings.
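The URL shapes listed above can also be screened for programmatically; here is a small Python sketch (the regex is my own heuristic for this post, not part of sqlmap, and the URLs are made up):

```python
import re

# Matches URLs whose query string passes a numeric id straight to the
# backend, e.g. index.php?id=42 -- the classic SQL-injection candidate.
pattern = re.compile(r"\.php\?\w+=\d+")

urls = [
    "http://example.com/index.php?id=7",
    "http://example.com/gallery.php?id=31",
    "http://example.com/about.html",
]
candidates = [u for u in urls if pattern.search(u)]
print(candidates)  # the two .php?id= URLs
```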

For starters, on Mac, sqlmap can simply be downloaded, extracted into a separate directory, and then executed just like any other Python script. Finding will be the first step; exploiting and exposing is the second.

Finding the database supported by the backend:

For the purpose of demonstration, we will be referring to this website. To find the database being used, enter the full URL as a parameter to the sqlmap script as shown below,


The script will ask a few questions and will spew out a lot of output and just before it ends it will reveal the full implementation details/stack of the web application as shown below,


This is a standard LAMP stack. So far, we have been able to find the implementation information; it would be really nice if we were also able to find out what is in the database. For this, we will need to execute the script with the --dbs option as shown below,


Once again the script spews a lot of output and in the end it yields the name of the 2 databases contained in the backend,


This completes the search phase; we have extracted the necessary information about the web app. Also, we now know that it comprises 2 databases, namely information_schema and webscantest. In the second part, we will delve deeper into extracting the data from these databases.

Security is now a prime concern when writing apps for handheld devices, specifically on an Android-based system, where a root user can gain access to almost everything which is there or was there on the device (now please reimagine listing your device on Craigslist for sale :)).

A quick refresher: mobile devices provide the following mechanisms and storages to securely store information (since usernames and passwords are the most stored items, we will specifically cover and refer to those):

The following Storages are available:

  • Sqlite
  • Shared Preferences
  • In Code
  • Keystore/Keychain storage

The following mechanisms are the most followed (in order of least to most recommended):

  • Store as cleartext in Sqlite/Shared preferences
  • Store encrypted using a symmetric key
  • Using Android Key Store
  • Store encrypted using asymmetric key

Cleartext storage is least recommended as there is no protection of the user’s data. Anyone can extract the prefs/SQLite file using the adb backup command. There can be an argument to set the android:allowBackup flag to false in AndroidManifest.xml, but since Android is Linux-based, even then root can have access to all files.

A much better idea is to encrypt the data before it is stored. However, in this case it is very important that the encryption key is not stored in the code itself as the code can be decompiled.

Symmetric vs Asymmetric Encryption:

Symmetric encryption uses a single key to encrypt and decrypt data. Asymmetric encryption uses 2 keys: a public key (for encryption) and a private key (for decryption). Asymmetric encryption is a better option due to the involvement of 2 keys. Critical apps especially (medical/banking) should at no point have the password visible on the phone. A use case can be: the user enters the password on the mobile device, it is encrypted using the public key and sent to the server, where decryption happens using the private key (the private key being stored on the server). After the password is verified, a token is sent back to the app to allow the user to log in. This flow is especially important for HIPAA-compliant apps, where a quick check for one of the HIPAA regulations is to put your device in flight mode and check if the user can still log in. If the user can still log in, then probably the app is not HIPAA compliant.

Symmetric VS Asymmetric Algorithms



For apps which do not connect to a backend but still require encrypted information to be stored, the Android Keystore is a good place to store the public and private keys.





  • X items on the visible list means X views inflated/created, i.e., getView() gets called X times with convertView as null.
  • Items in the recycler at this point: 0.
  • The user scrolls up.
  • getView() gets called one more time, and two things happen:
  • getView() is called with convertView as null; it inflates and returns the view that just became visible.
  • The view that just disappeared gets dumped into the recycler.
  • Total views: number of views on screen + 1 in the recycler.
  • The user scrolls further; view recycling kicks in, and convertView is no longer null.
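The bookkeeping above can be simulated in plain Java (a toy model with no Android APIs; all names are stand-ins) to show the inflation count settling at visible rows + 1:

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

// Toy model of ListView recycling: getView() inflates a new "view" only
// when the recycler has no scrapped view to hand back as convertView.
public class RecyclerModel {
    int inflations = 0;
    final Deque<Object> recycler = new ArrayDeque<>();
    final List<Object> visible = new ArrayList<>();

    Object getView(Object convertView) {
        if (convertView != null) return convertView; // recycle the scrap
        inflations++;                                // no scrap: inflate
        return new Object();
    }

    void showInitial(int rows) {                     // first layout: recycler empty
        for (int i = 0; i < rows; i++) visible.add(getView(null));
    }

    void scrollOneRow() {
        Object scrap = recycler.poll();              // null on the first scroll
        visible.add(getView(scrap));                 // incoming row appears
        recycler.push(visible.remove(0));            // outgoing row is scrapped
    }
}
```

After the first scroll one extra view exists; every later scroll reuses it, so the total stays at views-on-screen plus one in the recycler.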


This post outlines the decryption of the popular Whatsapp Crypt8 database file. To accomplish this we will need a few tools:

  • A PC running either macOS or Windows. For this tutorial, I executed the steps on a Windows 7 system.
  • Cygwin (for a Windows-based system; on a Mac you probably won't need anything else).
  • Basic knowledge of hexdump, openssl, and gzip.
  • A rooted Android device.

WhatsApp stores the decryption key at /data/data/com.whatsapp/files/key on the phone. Extract this file.

WhatsApp periodically backs up data to the SD card at /sdcard/WhatsApp/Databases/msgstore.db.crypt8. Extract that too.

Now, fire up the cygwin shell and run the following commands in the order mentioned below:

First, extract the AES key and the initialization vector:

#hexdump -e '2/1 "%02x"' key | cut -b 253-316 > aes.txt
#hexdump -e '2/1 "%02x"' key | cut -b 221-252 > iv.txt

Next, strip the 67-byte header:
#dd if=msgstore.db.crypt8 of=msgstore.db.crypt8.nohdr ibs=67 skip=1

Decrypt the result into a gzip stream (note that openssl's -K and -iv options take hex strings, not file names):
#openssl enc -aes-256-cbc -d -nosalt -nopad -bufsize 16384 -in msgstore.db.crypt8.nohdr -K "$(cat aes.txt)" -iv "$(cat iv.txt)" > msgstore.gz

Finally, extract the database from the gzip archive:
#gzip -cdq msgstore.gz > msgstore.db

And you are done. You can now view the file in SQLite Browser.
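For completeness, the decrypt-and-gunzip steps can also be sketched in Java (a hypothetical helper; it assumes you pass the 32-byte AES key and 16-byte IV extracted above, plus the header-stripped file body):

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.util.zip.GZIPInputStream;
import javax.crypto.Cipher;
import javax.crypto.spec.IvParameterSpec;
import javax.crypto.spec.SecretKeySpec;

// Sketch of the openssl step in Java: AES-256-CBC decryption with no
// padding (matching -nosalt -nopad), followed by gzip inflation.
public class Crypt8Decrypt {
    static byte[] decryptAndGunzip(byte[] body, byte[] key, byte[] iv) throws Exception {
        Cipher c = Cipher.getInstance("AES/CBC/NoPadding");
        c.init(Cipher.DECRYPT_MODE, new SecretKeySpec(key, "AES"), new IvParameterSpec(iv));
        byte[] gz = c.doFinal(body);                 // recovered gzip stream
        try (GZIPInputStream in = new GZIPInputStream(new ByteArrayInputStream(gz));
             ByteArrayOutputStream out = new ByteArrayOutputStream()) {
            byte[] buf = new byte[8192];
            for (int n; (n = in.read(buf)) > 0; ) out.write(buf, 0, n);
            return out.toByteArray();                // the SQLite database bytes
        }
    }
}
```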

Abstract Class Vs Interface

Yes, I understand the title reflects the age-old, repetitive interview question. It has been asked again, again, and again, and then just some more times. But why is it so important? Why does every software company on the planet keep asking it? We all know the answer (duh?), they all know the answer (duh?), and nearly every preparation site covers it, so what is the big fuss about?

Well, the fascinating thing about an iceberg is that the visible tip hides its total size. In a very similar fashion, for this question everybody understands the syntax but fails to understand the semantics. If you have been reading about this, a few well-known answers are:

  • An interface is a 100% abstract class.
  • A Java class can implement multiple interfaces but can extend only one abstract class.
  • A Java interface is implemented using the keyword "implements"; a Java abstract class is extended using the keyword "extends".

All of the above describe the mechanics of using abstract classes and interfaces. Although factually correct, they fail to answer a fundamental design question: "When should my software use an abstract class, and when should it use an interface?"

To answer this, let us begin with a question: have you ever noticed that when we inherit from an interface we use "implements", whereas when we inherit from an abstract class we use "extends"? Yes? No? Puzzled?

This little piece of inheritance jargon is the key to understanding abstract classes vs interfaces. Not only in software engineering but in any engineering field (even in basic English), the keyword "implements" is associated with implementing a functionality, while "extends" is associated with enhancing the characteristics of a type. Keeping this in mind, whenever we implement something we are defining "what" a class can do, or what it is capable of. For example, a Ball, which "is a" Toy, is "capable of" bouncing. Got it? Saw the light? Not yet?

Maybe Now:

class Ball extends Toy implements Bounceable

So whenever we talk about functionality, think interfaces, and whenever we talk about characteristics, think abstract classes (not always, but certainly when we analyze in terms of extension). As a general rule of thumb:

Interface(s):

  • Talks about what a class can do (functionality).
  • Declares what kind of functionality a class should implement.
  • Uses the "is capable of" relationship.

Abstract Class(es):

  • Talks about the characteristics an object should support.
  • Defines the type of an object.
  • Uses the "is a" relationship.
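The Ball example can be fleshed out as a short runnable sketch (the names come from the example above; the bodies are illustrative):

```java
// Functionality: what a Ball "is capable of".
interface Bounceable {
    void bounce();
}

// Characteristic: what every Toy "is", shared by all subtypes.
abstract class Toy {
    private final String name;
    Toy(String name) { this.name = name; }
    String name() { return name; }
}

// A Ball "is a" Toy and "is capable of" bouncing.
class Ball extends Toy implements Bounceable {
    Ball() { super("Ball"); }
    @Override public void bounce() { System.out.println(name() + " bounces!"); }
}
```

The abstract class carries state and identity down the hierarchy, while the interface only promises behaviour, so a Kangaroo could also be Bounceable without being a Toy.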

Hope this cleared some of the fog.